
On the Aggregatability of Router Forwarding Tables
Xin Zhao Yaoqing Liu Lan Wang Beichuan Zhang

zhaox@email.arizona.edu yliu6@memphis.edu lanwang@memphis.edu bzhang@arizona.edu
Abstract: The rapid growth of global routing tables has raised concerns among many Internet
Service Providers. The most immediate concern regarding routing scalability is the size of the
Forwarding Information Base (FIB), which seems to be growing at a faster pace than router
hardware can support. This paper focuses on one potential solution to this problem: FIB
aggregation, i.e., aggregating FIB entries without affecting the forwarding paths taken by data
traffic. Compared with alternative solutions to the routing scalability problem, FIB aggregation
is particularly appealing because it is a purely local software optimization limited within a
router, requiring no changes to routing protocols or router hardware. To understand the
feasibility of using FIB aggregation to extend router lifetime, we present several FIB
aggregation algorithms and evaluate their performance using routing tables and updates from tens
of networks. We find that FIB aggregation can reduce the FIB table size by as much as 70% with
small computational overhead. We also show that the computational overhead can be controlled
through various mechanisms.

I. INTRODUCTION

The global Internet routing table has been growing at an alarming rate ([21], [13], [20]),
driven in part by the increasing number of organizations connected to the Internet, and in part
by the increasing practices of multi-homing and traffic engineering. This rapid increase in
routing table size appears to outpace the increase in memory size, especially for the type of
memory used in line cards for fast lookup. Moreover, it forces ISPs to upgrade their routers at a
faster pace, which not only causes higher operational cost to the ISPs, but also makes issues
such as power consumption and lookup speed more prominent. This routing scalability problem has
raised concerns in both industry and research communities, as documented in the report from the
IAB Workshop on Routing and Addressing [21].

Several solutions have been proposed under the IRTF RRG [3] and IETF GROW [2] working groups (see
Section V for more information). In order to address the root cause of the scalability problem,
we need fundamental changes to the Internet routing architecture and protocols. However,
deploying these architectural changes may take a long time, as we have learned from the
deployment of IPv6. While we believe that architectural changes will be beneficial to the
Internet in the long run, we should not avoid short-term solutions that can help ISPs reduce
their operational costs, which will in turn be beneficial to end users. In particular, ISPs
urgently need a solution to reduce their forwarding table size. Forwarding tables are derived
from routing tables and router configurations, so their size increases as routing tables grow.
However, forwarding tables use high performance memory that is more expensive and more difficult
to scale than the memory used to hold routing tables. Therefore, their size is a more immediate
concern to ISPs and vendors.

This paper investigates the feasibility of a purely local solution: FIB aggregation, which is to
combine multiple entries in the forwarding table without changing the next hops for data
forwarding. This approach is particularly appealing because it can be done by a software upgrade
at a router and its impact is limited within the router. It does not require changes to routing
protocols or router hardware, nor does it affect multi-homing, traffic engineering, or other
network-wide operations. It is important to note that FIB aggregation is not a replacement for
the long-term architectural solutions because it does not address the root causes of the routing
scalability problem. Instead, FIB aggregation is a local solution that can be quickly implemented
and deployed in the short term, and in the long run, it can co-exist with and complement
architectural solutions.

The idea of FIB aggregation is rather intuitive, but to the best of our knowledge, no study has
evaluated its potential benefits or costs. FIB aggregation is an opportunistic technique: its
effectiveness depends on what prefixes are present in the table, how many of them can be
numerically represented by a single prefix, and how many of them share the same next-hop in data
forwarding. The benefits of FIB aggregation come with certain costs, such as extra CPU overhead.
The costs also depend on the actual aggregation algorithms, and how routing changes are handled
to update the forwarding table. A thorough understanding of both factors is needed in order to
decide whether FIB aggregation is a viable solution.

This paper conducts a systematic analysis and evaluation of FIB aggregation to understand its
gains and costs. We recognize that there can be different levels of aggregation, each
representing different tradeoffs between table size reduction and computation complexity. We
design and implement five algorithms at different aggregation levels, and evaluate them using
publicly available routing tables from tens of networks. The results show that the lowest-level
aggregation can reduce table size from 50% to 70%, equivalent to the full table size two and a
half years ago, while the highest-level aggregation can reduce the table size to 30%, equivalent
to the full table size eight years ago. The computation time of one aggregation run ranges from
50 milliseconds to 250 milliseconds on a commodity Linux machine. Although these numbers may not
reflect the computation time on a router, they reflect the relative speed of different levels of
aggregation. Moreover, such full aggregation is performed only when certain thresholds are
reached, so the computation time is amortized over time.

To handle routing changes, we design and implement algorithms to incrementally update the
aggregated FIB upon a change. The full aggregation algorithm is only invoked when the router CPU
load is low or the FIB size becomes above
a threshold. The computation time can be further reduced by not aggregating highly active
prefixes, which are a small fraction of the routing table but are responsible for many routing
changes. The evaluation using one-month BGP routing updates shows that compared with
un-aggregated FIB, the computation overhead of maintaining aggregated FIB over time is small.

Overall, our conclusion is that FIB aggregation is a viable solution to the routing scalability
problem, since it provides significant reduction in table size, and its extra computation can be
controlled by various mechanisms, such as using different levels of aggregation, incremental
update algorithms, on-demand invocation, and not aggregating highly active prefixes.

The rest of the paper is organized as follows. Section II gives an overview of FIB aggregation,
why it may be effective and what tradeoffs are involved. Section III presents various aggregation
techniques and algorithms. Section IV evaluates the full aggregation algorithms as well as
techniques to maintain the aggregated FIB upon routing changes. Section V discusses related work,
and Section VI concludes the paper.

II. FIB AGGREGATION

There are two types of tables used by routers for routing and forwarding: Routing Information
Base (RIB) and Forwarding Information Base or FIB (Figure 1). RIB is stored in the main memory of
a route processor. The route processor receives and processes routing update messages and runs
routing protocols, e.g., OSPF [22] and BGP [25], to compute the RIB. Each RIB entry contains the
destination IP prefix and associated route information. For example, BGP maintains full AS path
and many other attributes for each prefix in RIB. FIB is derived from RIB and router
configurations. It is stored in line cards, whose job is to forward data packets. Therefore, FIB
usually uses high performance memory, which is more expensive and more difficult to scale. For
each destination IP prefix, the FIB has an entry to store the next-hop IP, next-hop MAC address
and outgoing interface for fast data forwarding. Figure 1 illustrates these different components
in a router.

Fig. 1. RIB and FIB

Despite growth constraints such as a shortage of IPv4 addresses and strict allocation policies
[21], the global routing table in the Internet's default free zone (DFZ) has been growing at an
alarming rate in recent years. Currently, a router in the DFZ has hundreds of thousands of routes
in its RIB and possibly millions of entries in its FIB. This is in part due to the sheer growth
of the Internet, and in part due to the lack of aggregation. With multi-homing [8], where a
customer network buys Internet service from multiple providers for resilient Internet
connectivity, the customer's address prefix(es) must be visible in the global routing table in
order to be reachable through any of its providers, thus breaking down provider-based
aggregation. Traffic engineering is another contributing factor. For example, a network may try
to influence the paths of specific incoming traffic flows by splitting its address prefix into
several longer prefixes and injecting the longer prefixes at different network attachment points.
Splitting prefixes is also used as a defense mechanism against prefix hijacking, since hijacking
longer prefixes is less effective than hijacking shorter prefixes due to routers' longest-match
routing lookup. A growing routing table leads to larger FIBs, larger RIBs, and more routing
churn. Among these problems, ISPs and vendors are more concerned about the FIB size than the RIB
size, because it is more difficult to scale up the memory in line cards than in route processors
[13].

The conventional way of reducing routing table size is to aggregate RIB, which will also reduce
FIB size. However, RIB aggregation has very limited adoption in the Internet. At a prefix's
origin network, there is little incentive to aggregate the prefix, because the gain of
aggregating a small number of self-originated prefixes does not make much difference to the table
size. At the same time, the origin network actually has incentives, such as multi-homing and
traffic engineering, to split the prefix. At a remote site, aggregation opportunity is limited
since two prefixes must have the same path attributes in order to be aggregated. Otherwise their
path information will be lost and protocol functions may be affected. Forcing aggregation of RIB
prefixes that have different paths would also defeat multi-homing and traffic engineering by the
prefix origin networks.

FIB aggregation may be more effective than RIB aggregation in reducing FIB size since it only
requires prefixes to have the same next-hop in order to be aggregated. For example, consider an
LA router connecting to a Tokyo router, which in turn connects to a Beijing router and a Shanghai
router. The LA router may reach prefixes announced by China Telecom via different paths, some via
Beijing and some via Shanghai. However, in its FIB, most of these prefixes take the Tokyo router
as the next-hop, thus creating opportunity for aggregation.

Besides sharing the same next-hop, prefixes also need to be numerically aggregatable. This is
possible due to two factors. First, in IP address allocation, large blocks of Internet addresses
are first allocated to Regional Internet Registries and then they further allocate the addresses
to networks within the same region. A router outside the region tends to use the same next-hop to
reach these address prefixes, which can then be aggregated. Second, for prefixes split for
traffic engineering or other purposes, a router near the origin network is likely to have
different next-hops, but a router further away from the origin network is more likely to have the
same next-hop towards these numerically aggregatable prefixes.
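
The two conditions discussed above (same next-hop, numerically aggregatable) can be illustrated
with a minimal Python sketch built on the standard ipaddress module. The helper name can_merge,
the prefixes, and the next-hop labels are hypothetical and chosen only for illustration; this is
not the implementation evaluated in this paper.

```python
import ipaddress

def can_merge(p1, nh1, p2, nh2):
    """Return the parent prefix if p1 and p2 are the two halves of one
    parent prefix and share a next-hop; otherwise return None.
    (Hypothetical helper, for illustration only.)"""
    a = ipaddress.ip_network(p1)
    b = ipaddress.ip_network(p2)
    if nh1 != nh2:
        return None                      # merging would change a forwarding path
    if a == b or a.prefixlen != b.prefixlen or a.prefixlen == 0:
        return None                      # siblings must be distinct and equally long
    parent = a.supernet()                # the prefix one bit shorter
    if b.supernet() != parent:
        return None                      # not numerically aggregatable
    return parent

# Sibling /17s with the same next-hop collapse into their /16.
print(can_merge("10.1.0.0/17", "Tokyo", "10.1.128.0/17", "Tokyo"))
# Same address blocks, different next-hops: no aggregation.
print(can_merge("10.1.0.0/17", "Tokyo", "10.1.128.0/17", "Seattle"))
# Same next-hop, but the prefixes are not halves of one parent.
print(can_merge("10.1.0.0/17", "Tokyo", "10.2.0.0/17", "Tokyo"))
```

The sketch only covers the sibling case; covered prefixes and non-adjacent prefixes need the
tree-based techniques of Section III.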
Therefore, although FIB aggregation is opportunistic and the aggregation degree varies from
router to router, there are some inherent properties of the Internet that can make FIB
aggregation effective. If FIB aggregation is effective in reducing table size, its most appealing
feature is that the impact is limited within a router's data plane. It does not change any
routing protocols, or any router's routing decisions. Data traffic still flows on the same router
paths. Therefore, it can co-exist with almost any new routing protocols, including those
architectural solutions to the routing scalability problem in the long run.

The idea of FIB aggregation is not new. It was mentioned as a potential strategy in "Preliminary
Recommendation for a Routing Architecture" [19]. Through personal exchanges, we have learned that
one small vendor has implemented a simple FIB aggregation scheme (similar to our Level-1
aggregation). There is also a patent for a FIB aggregation algorithm [9]. However, to the best of
our knowledge, we are the first to study more complex FIB aggregation techniques, present an
in-depth analysis of different levels of FIB aggregation, and systematically evaluate its
effectiveness and overhead.

III. FIB AGGREGATION TECHNIQUES AND ALGORITHMS

There are two main questions in designing FIB aggregation techniques: how to aggregate the full
FIB and how to update an aggregated FIB upon a routing change. We consider four levels of full
FIB aggregation, each associated with different tradeoffs. We also propose a few techniques to
reduce the computation time in updating the FIB. The algorithms presented in this section assume
the routing table is stored in a tree structure. Though our implementation uses a Patricia trie,
a tree-like data structure widely used for routing tables since the BSD implementation, the
algorithms should apply to any tree data structure. Note that our algorithms do not build any
additional trees just for aggregation; we simply use the existing trees that the RIB and FIB use.
For a network device that uses a non-tree data structure to implement routing tables, the general
techniques discussed here still apply.

A. Full FIB Aggregation

To ensure packet delivery and avoid changing the path that a packet will take, a FIB aggregation
scheme should satisfy the property of forwarding correctness: after longest match lookup, if a
destination address has a non-NULL next-hop in the un-aggregated FIB, it should have the same
next-hop in the aggregated FIB. With this requirement satisfied, there can be four levels of
aggregation (Figure 2). Each level aggregates the table more but incurs more computation and
other overhead.

a) Level 1 Aggregation: this technique is illustrated in Figure 2(a). The simplest form of
aggregation is to remove prefixes that share the same next-hop with their immediate ancestor
prefixes, in which case we say that the "covered prefix" has the same next-hop as the "covering
prefix" and can be removed from FIB. Addresses that previously match the covered prefix now will
match the covering prefix and still get the same next-hop. Previously non-routable packets, whose
table lookup ends up with NULL next-hop, will still be non-routable. This aggregation does not
introduce any new prefix nor extra routable space into the table.

The algorithm implementing this technique simply traverses the tree recursively from the root
node in postorder. When it arrives at a node with a prefix, it compares this prefix's next-hop
with its immediate ancestor prefix's next-hop. If they have the same next-hop, it labels the
current node non-FIB, otherwise it labels it in-FIB. The immediate ancestor prefix's next-hop is
updated and remembered during the tree traversal. Eventually every prefix node is labelled as
either non-FIB or in-FIB, and all in-FIB prefixes comprise the aggregated FIB. The aggregation is
done recursively throughout the entire table. The computation time is O(n), where n is the total
number of nodes in the tree.

b) Level 2 Aggregation: this technique is illustrated in Figure 2(b). In addition to performing
Level 1 aggregation, Level 2 combines sibling prefixes that share the same next-hop into a parent
prefix. If the parent node already has a prefix with a different next-hop, then the aggregation
cannot be done. If the parent node already has a prefix with the same next-hop, then it is part
of Level 1 aggregation. Therefore, Level 2 is done when the parent node has no prefix. The net
result is to introduce a new prefix to cover two existing prefixes, but there is no extra
routable space introduced, i.e., the aggregated FIB covers exactly the same address space as the
un-aggregated FIB.

The algorithm implementing Level 2 aggregation traverses the tree recursively from the root node
in postorder. Besides doing Level 1 aggregation, when it arrives at a node without a prefix, it
compares this node's two children. If both children have prefixes and use the same next-hop, then
both children are labelled non-FIB, and this current node is assigned the parent prefix and
labelled in-FIB. The aggregation is done recursively throughout the entire table. The computation
time is O(n), where n is the total number of nodes in the tree.

c) Level 3 Aggregation: this technique is illustrated in Figure 2(c). In addition to performing
the Level 1 and 2 aggregation, Level 3 aggregates a set of non-adjacent prefixes that have the
same next-hop into a super prefix. Between these non-adjacent prefixes, non-routable space is
allowed. For example, in Figure 2(c), at the bottom level of the tree, there are two nodes with
address prefixes (real nodes) sharing the same next-hop. However, these two nodes are separated
in the tree by two nodes without address prefixes. The prefixes of the two real nodes can be
aggregated into a grandparent prefix. A side effect is that this newly inserted prefix covers
previously non-routable space, therefore some previously non-routable traffic (which would have
been dropped by this router) will be forwarded along the next-hop of the aggregate prefix. This
does not violate the correctness requirement since all previously routable traffic is still
routable and will follow the same path. The impact of introducing extra routable space into the
FIB depends on how much traffic is destined to that address space. In normal operational
conditions, the volume of such traffic should be negligible. However, malicious traffic such as
port scanning usually explores such non-routable space
Fig. 2. Different Levels of FIB Aggregation. Panels: (a) Level 1: removing covered prefixes;
(b) Level 2: combining sibling prefixes; (c) Level 3: allowing extra routable space; (d) Level 4:
allowing holes in the aggregate. The binary tree represents part of the IP address space. Nodes
labeled with letters are prefixes in the routing table, and the letter represents the next-hop
for the prefix. Nodes without labels do not have their corresponding prefixes in the routing
table. Filled nodes are extra routable space introduced by the aggregation.
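
As a rough illustration of panels (a) and (b) of Figure 2, the following Python sketch applies
Level 1 and Level 2 aggregation to a toy FIB and checks forwarding correctness by brute force. It
rests on several assumptions that are not taken from this paper: prefixes are bit strings over a
4-bit address space, the tree is a plain binary trie rather than a Patricia trie, Level 1 is done
in a top-down pass for brevity, and all names (Node, build, level1, level2, aggregate, lookup) are
hypothetical.

```python
class Node:
    def __init__(self):
        self.child = [None, None]   # 0-branch and 1-branch
        self.next_hop = None        # None means "no prefix at this node"

def build(fib):
    """Build a binary trie from {bit-string prefix: next-hop}."""
    root = Node()
    for prefix, nh in fib.items():
        node = root
        for bit in prefix:
            i = int(bit)
            if node.child[i] is None:
                node.child[i] = Node()
            node = node.child[i]
        node.next_hop = nh
    return root

def level2(node):
    """Postorder pass: a prefix-less node whose two children both carry
    the same next-hop is assigned that next-hop (the parent prefix)."""
    if node is None:
        return
    level2(node.child[0])
    level2(node.child[1])
    left, right = node.child
    if (node.next_hop is None and left is not None and right is not None
            and left.next_hop is not None
            and left.next_hop == right.next_hop):
        node.next_hop = left.next_hop

def level1(node, prefix, inherited, out):
    """Keep a prefix only if its next-hop differs from the next-hop of
    its nearest ancestor prefix; covered prefixes are left out."""
    if node is None:
        return
    if node.next_hop is not None:
        if node.next_hop != inherited:
            out[prefix] = node.next_hop
        inherited = node.next_hop
    level1(node.child[0], prefix + "0", inherited, out)
    level1(node.child[1], prefix + "1", inherited, out)

def aggregate(fib, use_level2=True):
    root = build(fib)
    if use_level2:
        level2(root)
    out = {}
    level1(root, "", None, out)
    return out

def lookup(fib, addr):
    """Longest-prefix match on bit strings; None means non-routable."""
    for length in range(len(addr), -1, -1):
        if addr[:length] in fib:
            return fib[addr[:length]]
    return None

fib = {"00": "A", "000": "A", "0010": "B", "10": "A", "11": "A", "1101": "B"}
l1 = aggregate(fib, use_level2=False)
l2 = aggregate(fib)
print(len(fib), len(l1), len(l2))   # 6 5 4

# Levels 1 and 2 add no routable space, so every 4-bit address must map
# to the same next-hop (or stay non-routable) before and after.
for n in range(16):
    addr = format(n, "04b")
    assert lookup(fib, addr) == lookup(l1, addr) == lookup(l2, addr)
```

In this toy table, Level 1 drops the covered prefix 000, and Level 2 additionally replaces the
siblings 10 and 11 with their parent 1. Levels 3 and 4 would need a weaker check, since they may
turn previously non-routable addresses into routable ones.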
and in certain cases it may become noticeable. Eventually these packets will be dropped, either
because they arrive at a router that does not have a route for these packets, or because the
packet's time-to-live expires. These packets will consume bandwidth during transit and that is a
new type of overhead introduced by Level 3 aggregation. A good Level 3 algorithm should consider
the tradeoff between table size reduction, extra routable space introduced, and computation time.

Level 3 aggregation must be implemented with care to ensure its forwarding correctness. For
example, in Figure 2(c), two grandchildren prefixes are aggregated into one grandparent prefix.
This would be incorrect if there is already a great-grandparent prefix (not shown in the figure)
covering the subtree with a different next-hop B, because that means the two middle nodes at the
bottom level are not non-routable space and their next-hops would change from B to A after the
aggregation. In order to handle this case without introducing much computation overhead, we
decide that in our implementation we only apply this type of aggregation to prefixes that do not
have any existing ancestor prefix. In a typical DFZ routing table, about half of all the prefixes
have no ancestor and the other half have ancestors. The prefixes that have ancestors can be
aggregated by Level 1 and Level 2, therefore our choice does not lose too much aggregation
capability.

The algorithm implementing Level 3 aggregation traverses the tree recursively in postorder.
Besides doing Level 1 and Level 2 aggregation for all nodes, when it arrives at a prefix that
does not have any ancestor, it checks whether this prefix has a sibling node that does not have a
prefix. If yes, it returns a pointer to this prefix node to its parent node, which will further
pass this pointer up along the tree. When an upper level node has two such pointers, one from a
left descendant and another from a right descendant, and these two descendants have the same
next-hop, then a new prefix is created at this upper level node and labelled in-FIB, while the
two descendant nodes are labelled non-FIB. If the two descendants have different next-hops, then
aggregation cannot be done and they remain in-FIB. The computation complexity is O(n), where n is
the number of nodes in the tree.

d) Level 4 Aggregation: this technique is illustrated in Figure 2(d). In addition to performing
Level 1, 2 and 3 aggregation, Level 4 aggregates a set of non-adjacent prefixes with the same
next-hop. The difference from Level 3 aggregation is that, in Level 4, between the non-adjacent
aggregated prefixes, other prefixes with different next-hops are allowed, while Level 3 only
allows non-routable space. For example, in Figure 2(d), a node with next-hop B is allowed to be
between the prefixes being aggregated, punching a "hole" in the aggregate prefix. This type of
aggregation maintains forwarding correctness and may also introduce extra routable space as
Level 3 does. For the same reason as in Level 3, our algorithm only applies this type of
aggregation to prefixes that do not have ancestor prefixes.

The seemingly trivial difference between Level 4 and Level 3 actually has significant
implications for algorithm design. It allows the maximum flexibility for aggregation. However,
taking advantage of it may also require significant computational time. For example, given a set
of non-adjacent prefixes with different next-hops, which super-prefix should be inserted? Which
next-hop should the super-prefix take? Finally, how should the decision be made without too much
computational complexity? In this paper, we present and evaluate two different Level 4 algorithms
described as follows.

The Level 4A algorithm traverses the tree recursively once in postorder. Besides doing Level 1, 2
and 3 aggregations, when it arrives at a prefix that does not have any ancestor, it returns a
pointer to this prefix node to its parent, which will further pass this pointer up along the
tree. An upper level node will receive two lists of its descendants, one from its left child and
the other from its right child. This node combines the two lists to get all its descendants and
their next-hops, picks the most popular next-hop as its own next-hop and inserts a prefix at this
node. All the descendants that use the most popular next-hop will be labelled non-FIB, and other
descendants are labelled in-FIB. If multiple next-hops tie for the most popular, then one of them
will be randomly selected. The computation time is O(n), where n is the number of nodes in the
tree.

The Level 4B algorithm is based on Herrin's proposal [5]. It traverses the tree twice. The first
step traverses the tree recursively in postorder, which is like sweeping all tree nodes from
bottom up. During this process, the algorithm calculates the most popular next-hop among all
descendant prefixes of a node and records this next-hop with the node unless this node already
has a prefix with a different next-hop. The second step traverses the tree recursively in
preorder, which is like going through all tree nodes from top down. During this process, the
algorithm tries to insert new prefixes with the most popular next-hop from all descendants (not
just immediate descendants as in Level 4A), as calculated in the previous postorder tree
traversal, and labels descendant prefixes non-FIB or in-FIB accordingly. When there are multiple
equally popular next-hops, we randomly select one. Under certain conditions a newly inserted
prefix at a higher level of the tree may be redundant
and will be removed. The computation time is O(n), where n is the number of nodes in the tree. It
tries to do a more thorough aggregation than Level 4A, but will take longer since it traverses
the tree twice.

B. Handling Routing Updates

Internet routes change over time, thus the obvious question is how to update the aggregated FIB
when there is a change. Re-running the full FIB aggregation will maintain the best aggregation
all the time, but it will also incur significant computation overhead, since the full FIB
aggregation accesses every tree node whereas updating an un-aggregated FIB only looks up and
updates a single prefix node.

We use the combination of four mechanisms to make sure that the computation cost of updating
aggregated FIBs is under the control of operators. First, operators can choose the level of full
FIB aggregation that suits their routers the best. Routers with faster CPU and fewer routing
updates can use higher level FIB aggregation, otherwise they can use lower level FIB aggregation.
Second, we design an algorithm that updates the aggregated FIB incrementally. The algorithm tries
to minimize the number of tree nodes that have to be accessed and changed to maintain forwarding
correctness after the routing change. It does not attempt to keep table size small. Third, the
full FIB aggregation is only invoked when needed, e.g., the table size has crossed a threshold
after being incrementally updated for a while, or when the router has free CPU cycles to spare,
i.e., the router load is under a threshold. Fourth, as shown in [24], a small number of highly
active prefixes are responsible for a large number of routing changes. Therefore we can keep
these highly active prefixes in the FIB without aggregating them out. Then every time they have a
routing change, the update process is the same as updating an un-aggregated FIB. Operators can
configure the criteria for deciding which prefixes should be kept in FIB regardless of the
aggregation process.

The basic idea of the incremental update algorithm is as follows. For each routing change, the
algorithm looks up the prefix in the RIB, and finds out the prefix node's nearest prefix
ancestor, and all the nearest prefix descendants. The nearest prefix ancestor is the nearest
ancestor node that has a prefix. Similarly, nearest prefix descendants are descendant nodes that
have prefixes with no other prefix nodes in between. In the worst case, all the nearest prefix
ancestor and nearest prefix descendant nodes need to be updated. For example, when an aggregate
prefix in Level 1 aggregation is withdrawn, in the worst case, all of its nearest prefix
descendants need to be converted from non-FIB back to in-FIB. But in many cases, we are able to
narrow the changes to a smaller number of nodes. For example, when a new prefix is announced, any
of its nearest prefix descendants that shares the same next-hop with the new prefix does not need
any change. We do not even remove them from the FIB since the goal here is to minimize changes,
not to maintain small table size. In certain cases, we can actually avoid changes that would have
happened to un-aggregated FIBs. For example, in Figure 2(a), if the covered prefix is withdrawn,
it would require a change to the un-aggregated FIB, but the aggregated FIB actually does not need
to be updated. Therefore, how many changes happen to the aggregated FIB and how many nodes each
change affects depend on the content of the routing updates. In Section IV, we use one month of
BGP routing updates to evaluate the incremental update algorithms and find that the number of
updated nodes per change is close to one, similar to updating an un-aggregated FIB.

C. Discussion

Our aggregation techniques ensure forwarding correctness by aggregating prefixes with the same
next-hop. In reality, there are other types of information stored in a FIB entry, such as packet
count. Aggregating two prefixes will probably lose such fine-grained statistics, which also
happens in all other routing scalability solutions, usually to a much wider extent. Losing more
detailed information is a necessary cost we have to pay in order to improve overall system
scalability. One way to mitigate the problem is having a router-wide packet counter instead of
per-FIB packet counter, thus the aggregated information from all line cards is still kept for
each prefix. Another way is for the operators to configure some important prefixes not to be
subject to FIB aggregation, thus fine-grained statistics of these important prefixes will be
kept.

Our aggregation algorithms have assumed a generic tree structure to store the routing tables, and
we have not attempted to optimize either the algorithm or the implementation. Our goal is to show
that, without special optimization, FIB aggregation in general is a viable solution to the
scalability problem with good table size reduction and small computational overhead. When FIB
aggregation is adopted in real networks, the algorithms and implementations can always be
optimized for specific hardware, operating systems, and routing table data structures.

IV. EVALUATION

We use publicly available routing tables from tens of networks to evaluate the various FIB
aggregation algorithms for their table size reduction, computing times, and extra routable space.
We also use BGP routing updates to evaluate our FIB update algorithm.

A. Methodology

The effectiveness of FIB aggregation depends on which prefixes are present in the routing table
and how these prefixes are distributed over next-hop routers. Generally speaking, the fewer
neighbors a router has, the better aggregation it may achieve. In the extreme case, all prefixes
share the same next-hop and the aggregation is maximized. According to Li et al. [18], although
some routers have high degrees up to a couple of hundreds, these routers connect to a large
number of end-customers, not transit neighbor routers. Therefore, they will still use only a
small number of next-hops, i.e., the transit neighbors, to reach most of the address prefixes.

We evaluate different FIB aggregation algorithms using publicly available BGP routing tables
taken from route servers [1] and the RouteViews project [6]. Although these routing tables
contain valid next AS hops, they either do not have next-hop router information or do not reflect
the diversity of next-hops that an operational router typically has, since the route monitors are
not operational routers. Therefore we need to generate realistic next-hops based on known
information. Our guideline for this process is to overestimate the number of next-hops so that
the table reduction results reflect the worst case scenario, and real routers are likely to have
better aggregation ratio. The specifics of generating next-hops are as follows.

Tables downloaded from route servers contain the iBGP neighbor address for each route. Assuming
intra-domain routing uses a single best path, prefixes that share the same iBGP neighbor will
share the same next-hop. Thus we use the iBGP neighbor as the next-hop in evaluations (see
Figure 3 for the relationship between next-hop, iBGP neighbor and next-AS-hop). This reflects the
worst case scenario since prefixes using different iBGP neighbors may actually use the same
next-hop router in reality, which will improve aggregation.

Fig. 3. Next-hop, iBGP neighbor, and next-AS-hop

Tables downloaded from RouteViews do not even contain iBGP neighbor address. They only contain
the AS path for each prefix. From the AS path, we can extract the next-AS-hop for each prefix. We
assume that prefixes sharing the same next-AS-hop also share the same iBGP neighbor. This is true
if there is only one router connecting to the neighbor AS. If there are multiple border routers
connecting to the same neighbor AS, according to BGP's decision process, a router will pick the
nearest border router to reach the neighbor AS. Therefore,

Fig. 4. Prefixes sharing the same next-AS-hop usually use the same iBGP neighbor. (Plot: x-axis,
Route Server; y-axis, Percent of prefixes satisfying assumption.)

update algorithms have been verified using this method.

All the evaluation has been done on a Linux machine with an Intel Core 2 Quad 2.83GHz CPU using a
single thread. The algorithms are implemented in C and no performance optimization techniques
have been attempted. The Patricia trie implementation is taken from the C source code of Perl's
Net::Patricia module [4], which in turn was adapted from MRTD's [23] source code.

We use the public BGP routing tables to do the evaluation because these tables come from a
diverse set of networks, from tier-1 ISPs to small networks. However, in operational networks,
there are other types of routes, such as VPN routes, which can also be numerous. The FIB
aggregation algorithms can be applied to these other types of routes as well, even though this
paper does not evaluate the effectiveness of doing so. We plan to obtain forwarding tables from
operational ISP routers to validate our results.

B. Table Size Reduction and Overhead

We applied our algorithms to 37 routing tables archived at RouteViews on Dec. 31, 2008 and then
calculated the ratio
barring customized route selection for specific prefixes, our between the aggregated FIB size and the original routing table
assumption is also true even if there are multiple connections to size. The results are presented in Figure 5(a) (the routers are
a neighbor AS. We use tables from route servers to validate this. ordered based on the aggregation ratio). We can make the
For each next-AS-hop, if there is only one iBGP neighbor, then following observations: (1) each level of aggregation can reduce
all the prefixes using this next-AS-hop satisfy the assumption. the FIB size more than the previous level, which is as expected;
If there are multiple iBGP neighbors, the one that carries the (2) even with the simple Level 1 aggregation, the FIB size can
most prefixes is called “popular,” and the prefixes that use the be reduced by 30% to 50%; (3) Level 4 aggregation can reduce
popular iBGP neighbors satisfy the assumption. As shown in the FIB size from 60% to over 90% with a median around 70%
Figure 4, most tables have more than 90% of prefixes satisfying – some of the tables have almost all the prefixes sharing the
the assumption. Based on this assumption, we use next-AS-hop same nexthop, so they can be aggregated into very few entries;
as the next-hop router in evaluation. This reflects the worst case and (4) the Level 4A algorithm is slightly better than the Level
scenario since large networks have hundreds to thousands of 4B algorithm, although the difference is almost negligible. The
neighbor ASes, but the number of real next-hops should be results for the route servers are similar, so we omit them from
much smaller. the paper.
We verify the correctness of each aggregated FIB by looking To evaluate the effectiveness of our algorithms over a longer
up every original RIB prefix and its sub prefixes in the FIB, period of time, we applied them to the RouteViews routing
which should give the same next-hop as that in the RIB. All the tables from 2001 to 2008. For each year, we obtained the
results from our FIB aggregation algorithms and incremental median aggregation ratio of all the routing tables. Figure 5(b)
shows an overall decreasing trend, suggesting that the FIB has
become more amenable to aggregation over the years. One
possible explanation is that the increasing practice of prefix
splitting due to multi-homing and traffic engineering has made
a larger percentage of FIB entries aggregatable. We plan to
further investigate this phenomenon in our future work.

Fig. 5. Ratio between Aggregated FIB size and Routing Table Size.
(a) RouteViews Tables, 2008/12/31; (b) RouteViews Tables, 2001 - 2008.
(Plots omitted; x-axis: Router ID in (a), Year in (b); y-axis: Table
Size Ratio (Aggregated FIB/RIB); curves: Level 1, Level 2, Level 3,
Level 4 (A), Level 4 (B).)

Fig. 6. Historical RIB size. (Plot omitted; x-axis: Time; y-axes: RIB
Size and Percent compared to 2008/12/31.)

Fig. 7. Computing Time (RouteViews Tables). (Plot omitted; x-axis:
Router ID; y-axis: Computation Time (ms); curves: Level 1, Level 2,
Level 3, Level 4 (A), Level 4 (B).)
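The table size ratio plotted in Figure 5, and the per-year median
behind Figure 5(b), amount to simple arithmetic. The following is a
minimal sketch of that calculation in Python. The function names and
the sample entry counts are hypothetical illustrations, not code or
measurements from this paper.

```python
from statistics import median


def table_size_ratio(aggregated_fib_entries: int, rib_entries: int) -> float:
    """Ratio between aggregated FIB size and original routing table size."""
    if rib_entries <= 0:
        raise ValueError("routing table must contain at least one entry")
    return aggregated_fib_entries / rib_entries


def median_ratio_per_year(tables_by_year):
    """Map each year to the median ratio over that year's routing tables.

    tables_by_year: {year: [(aggregated_fib_entries, rib_entries), ...]}
    """
    return {
        year: median(table_size_ratio(fib, rib) for fib, rib in tables)
        for year, tables in tables_by_year.items()
    }


# Hypothetical entry counts, for illustration only.
sample = {
    2001: [(60, 100), (50, 100), (70, 100)],
    2008: [(30, 100), (40, 100), (20, 100)],
}
medians = median_ratio_per_year(sample)
```

A lower ratio means a smaller aggregated FIB, so a downward trend in
these medians corresponds to tables becoming more aggregatable.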
In order to understand the significance of the above results,
we use the size of the routing table from 1994 to 2009 to
translate the table size reduction into how many years we turn
the clock back for a router (see Figure 6). The data is obtained
from bgp.potaroo.net, a site that tracks the growth of the BGP
table size. This figure shows that the FIB size in 2001 is around
30% of the FIB size on 12/31/2008, which means that if an ISP
uses our Level 4 aggregation algorithm (a median reduction of
around 70%), it will still be able to
use routers that were deployed in 2001. Assuming these routers
were sufficiently provisioned when they were purchased seven
or eight years ago, they should still be able to accommodate
some further routing table growth.

Figure 7 shows the computing time incurred by our algorithms
to aggregate each of the 37 routing tables. The Level 1
to 3 algorithms typically require tens of milliseconds, while
the Level 4 algorithms consume at most a few hundred
milliseconds. An operational router may have a slower CPU than
our commodity Linux machine, but it has specialized hardware
and software, thus it is hard to infer a router's computing
time from what we report here. Nevertheless, the simplicity
of the algorithms and the very short computing time suggest
that the computational overhead in an operational router may be
small. Moreover, our results can be used to compare the relative
speed between different aggregation algorithms. For example,
we can observe that the Level 4B algorithm is more computing
intensive than the Level 4A algorithm.

TABLE I
PROCESSING ROUTING UPDATES IN DECEMBER 2008

Algorithms                No. of     Total Proc.  Avg. Proc. Time  No. of Trie Nodes
                          Changes    Time (s)     (µs) per Change  Affected per Change
RIB                       7254478    N/A          N/A              N/A
Un-Aggregated FIB         3048038    14.039       4                1
Level-1 Aggregation       2904608    15.034       5                1.02
Level-4 Aggregation (A)   2890316    14.039       4                1.02

Fig. 8. Extra Routable Address Space (RouteViews Tables). (Plot omitted;
x-axis: Router ID; y-axis: Extra Address Space (# /8 prefixes); curves:
Level 3, Level 4 (A), Level 4 (B).)

In Figure 8, we show the amount of extra routable space
(measured by the number of /8 prefixes) introduced by three of
our algorithms. Since Level 1 and 2 algorithms do not introduce
new routable space, we do not present them here. We can see
that the Level 3 algorithm is better than the Level 4 algorithms
in this regard, while the Level 4A algorithm covers much more
routable space than the Level 4B algorithm. One technique to
prevent introducing a large amount of extra routable space is to
cap the aggregation at a certain prefix length, which is already
adopted by our Level 4B algorithm. This may explain the large
difference between the two algorithms. The fact that the two
Level 4 algorithms have similar table reduction ratios suggests
that limiting the aggregated prefix length could effectively
reduce the amount of extra routable space without significantly
affecting the table reduction ratio. We plan to further investigate
this issue in future work.
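The extra routable space reported in Figure 8 is expressed in units of
/8 prefixes. One plausible way to obtain such a number is sketched
below in Python. The function names and the example prefixes are
hypothetical, and the sketch assumes IPv4 and that the aggregated table
covers at least every address covered by the original table; it is not
the measurement code used for this paper.

```python
import ipaddress


def covered_addresses(prefixes):
    """Count distinct IPv4 addresses covered by a list of prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges overlapping and adjacent networks, so
    # each address is counted once.
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(networks))


def extra_routable_slash8(original_prefixes, aggregated_prefixes):
    """Extra routable space of the aggregated table, in units of one /8."""
    extra = covered_addresses(aggregated_prefixes) - covered_addresses(original_prefixes)
    return extra / 2 ** 24


# Hypothetical tables: two /10s sharing a next-hop are replaced by one /8,
# which also covers the previously unrouted 10.128.0.0/9.
original = ["10.0.0.0/10", "10.64.0.0/10"]
aggregated = ["10.0.0.0/8"]
extra = extra_routable_slash8(original, aggregated)  # half of a /8
```

Capping the aggregated prefix length, as Level 4B does, bounds how much
unrouted space any single aggregate can add under this measure.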
C. Routing Update Handling

We use one month (December 2008) of BGP routing updates
collected by RouteViews from the Level 3 Communications
peer router to evaluate our incremental update handling
algorithms. We summarize our results in Table I. There were
over 7 million routing updates during this month. For each
unaggregated or aggregated FIB, we counted how many of
these routing updates would cause a change to the FIB, i.e. an
insertion, removal, or a change to the next-hop of a FIB entry.
There were 3,048,038 changes to the unaggregated FIB. Note,
however, that an aggregated FIB may experience fewer changes
than the unaggregated FIB. For example, the aggregated FIB
from our Level 4A algorithm had 157,722 fewer changes than
the unaggregated FIB. This is due to two reasons. First, some
of the routing changes may be for prefixes already removed
from the FIB by the aggregation algorithm. Second, our update
algorithm minimizes the number of FIB changes at the cost
of slightly increased FIB size. On the other hand, each FIB
change may affect slightly more Patricia trie nodes in the
case of the aggregated FIB, as shown in the last column of
Table I. As a result, the processing times for aggregated and
unaggregated FIBs are very close. Each routing update took
at most an additional 1 µs to process (5 µs per update for
the Level 1 aggregated FIB compared to 4 µs per update for
the unaggregated FIB). In other words, our update handling
algorithm for FIB aggregation has minimal impact on the
processing time for routing updates.

Since our update handling algorithm trades off the FIB size
for fewer changes, the FIB needs to be re-aggregated when its
size reaches a certain threshold. To estimate how frequently
the re-aggregation will be triggered, we measure the growth
of the FIB size as our algorithm handles the BGP updates
during the month of Dec. 2008 (see Figure 9). The Level 4
aggregated FIB had 99,676 entries on Dec. 1, 2008 (37.3% of
the routing table size). If we set the FIB size threshold to be
150,000 entries (about 55% of the full routing table size), the
FIB would be re-aggregated five days later on Dec. 6, 2008.
Considering that each full aggregation run takes at most a few
hundred milliseconds on our commodity PC (perhaps a little
longer on a router), incurring this overhead every few days or
so should not be a concern for an ISP. Figure 10 shows the
FIB size when we use 150,000 as the threshold to trigger the
full aggregation. It indeed shows that the full aggregation was
triggered every few days.

Fig. 9. FIB size after applying Level 4A aggregation algorithm initially and
incremental update handling algorithm subsequently. (Plot omitted; x-axis:
Time; y-axis: FIB Size.)

Fig. 10. FIB size with periodic re-aggregation. (Plot omitted; x-axis:
Time; y-axis: FIB Size.)

Previous studies ([24], [26]) have shown that a small set of
highly active prefixes contribute to most of the routing updates.
We may be able to exploit this characteristic to further reduce
the frequency of changes to the aggregated FIB (at the cost of
increased FIB size). To evaluate the feasibility of this approach,
we counted the number of changes to an unaggregated FIB
caused by each prefix during Dec. 2008. We found that over
the one-month period, 99% of the prefixes had fewer than 4.5
changes per day. Meanwhile, 873 prefixes had more than 10
changes per day and a few of the most active prefixes incurred 300
FIB changes per day. If we avoid performing aggregation on
this small set of prefixes (i.e. leave such a prefix in the FIB even
if it is aggregatable), we may be able to reduce the frequency
of inserting these prefixes to the FIB and removing them from
the FIB. Incorporating this idea into our algorithms will be part
of our future research.

V. RELATED WORK

In response to the routing scalability problem, the IRTF
Routing Research Group (RRG) [3] was formed in search of
a long-term solution. Many concurrent proposals are being
discussed on the RRG mailing list and at RRG meetings.
In previous work [17], we classified the proposed solutions
into two categories, separation and elimination. One of the
separation approaches is Map-and-Encap ([10], [15]). Several
recently proposed schemes, e.g., LISP [11], APT [16], Ivip [27],
TRRP [14], are realizations of the Map-and-Encap concept.
These long-term solutions aim to reduce the routing table size,
which inevitably involves changes to the routing architecture
and protocols. However, these changes generally take a long
time to become a reality, if they ever happen. These
architectural solutions address the root causes of the routing table
growth, but they usually change the paths traffic will take, and
incur processing overhead in encapsulating and decapsulating
data packets.

Bill Herrin proposed FIB aggregation as a potential strategy
in "Preliminary Recommendation for a Routing Architecture" [19].
He described the basic idea of our Level 4B
algorithm, which includes a few improvements of our own.
Another proposed approach to reducing FIB size is Virtual
Aggregation ([12], [7]). Virtual Aggregation designates a small
set of routers (APRs) that announce virtual prefixes, so that
other routers do not need to install more specific prefixes under
those virtual prefixes in their FIB; other routers simply forward
their packets to the APRs responsible for the corresponding
virtual prefixes. It can be independently deployed by one ISP,
and does not require changes to the routing architecture or
protocols. However, Virtual Aggregation requires considerable
changes to network-wide router configurations and specialized
routers to announce virtual prefixes. Moreover, it introduces
extra delays (stretch) in packet delivery. To some extent, the
Level 4 FIB aggregation described in this paper can be viewed
as a local Virtual Aggregation limited within a router.

VI. CONCLUSION AND FUTURE WORK

We have presented an in-depth analysis of FIB aggregation
and our results suggest that it is a viable short-term solution
to the problem of growing FIB table size. Our aggregation
algorithms reduce the FIB size by as much as 70% and require
no hardware changes or network-wide software/configuration
changes, thus reducing the need for ISP router upgrades in the
short term. During this time, the research community and the
industry can work on a long-term solution to reduce both the
routing table and the FIB table. Moreover, FIB aggregation
can co-exist with any long-term solution to further reduce
ISPs' operational costs. We plan to continue our research on
FIB aggregation in the following areas. First, we will obtain
forwarding tables from operational ISP routers to validate our
results. Second, we would like to reduce the extra routable
space introduced by our algorithms without significantly
affecting their effectiveness. Third, we will investigate whether
avoiding aggregation of highly dynamic prefixes will reduce the
number of FIB changes. Fourth, we will study IP-range based
aggregation algorithms and associated FIB lookup techniques.
Finally, we would like to conduct an optimality analysis of
FIB aggregation techniques, that is, whether optimal algorithms
exist given certain constraints such as the amount of extra
routable space that can be introduced by an algorithm.

REFERENCES

[1] BGP4.net Wiki. http://bgp4.net.
[2] IETF Global Routing Operations (GROW).
    http://www.ietf.org/dyn/wg/charter/grow-charter.html.
[3] IRTF Routing Research Group.
    http://www.irtf.org/charter?gtype=rg&group=rrg.
[4] Net-Patricia Perl Module. http://search.cpan.org/dist/Net-Patricia/.
[5] Opportunistic Topological Aggregation in the RIB-FIB Calculation?
    http://www.ops.ietf.org/lists/rrg/2008/threads.html#01880.
[6] Advanced Network Technology Center and University of Oregon. The
    RouteViews project. http://www.routeviews.org/.
[7] H. Ballani, P. Francis, C. Tuan, and J. Wang. Making Routers Last Longer
    with ViAggre. In NSDI, 2009.
[8] T. Bu, L. Gao, and D. Towsley. On Characterizing BGP Routing Table
    Growth. Computer Networks, 45(1):45–54, May 2004.
[9] B. Cain. Auto aggregation method for IP prefix/length pairs.
    http://www.freepatentsonline.com/6401130.html, June 2002.
[10] S. Deering. The Map & Encap Scheme for Scalable IPv4 Routing with
    Portable Site Prefixes. Presentation, Xerox PARC, March 1996.
[11] D. Farinacci, V. Fuller, D. Oran, D. Meyer, and S. Brim. Locator/ID
    Separation Protocol (LISP). draft-farinacci-lisp-08, July 2008.
[12] P. Francis, X. Xu, H. Ballani, D. Jen, R. Raszuk, and L. Zhang.
    FIB Suppression with Virtual Aggregation.
    http://tools.ietf.org/html/draft-francis-intra-va-01, April 2009.
[13] V. Fuller. Scaling Issues with Routing+Multihoming.
    http://www.vaf.net/~vaf/apricot-plenary.pdf.
[14] W. Herrin. Tunneling Route Reduction Protocol (TRRP).
    http://bill.herrin.us/network/trrp.html.
[15] R. Hinden. New Scheme for Internet Routing and Addressing (ENCAPS)
    for IPNG. RFC 1955, 1996.
[16] D. Jen, M. Meisel, D. Massey, L. Wang, B. Zhang, and L. Zhang. APT:
    A Practical Tunneling Architecture for Routing Scalability. Technical
    Report 080004, UCLA, 2008.
[17] D. Jen, M. Meisel, H. Yan, D. Massey, L. Wang, B. Zhang, and L. Zhang.
    Towards A Future Internet Architecture: Arguments for Separating Edges
    from Transit Core. In ACM Workshop on Hot Topics in Networks, 2008.
[18] L. Li, D. Alderson, W. Willinger, and J. Doyle. A first-principles approach
    to understanding the Internet's router-level topology. In Proc. of ACM
    SIGCOMM, 2004.
[19] T. Li. Preliminary Recommendation for a Routing Architecture.
    http://tools.ietf.org/html/draft-irtf-rrg-recommendation-02, March 2009.
[20] X. Meng, Z. Xu, B. Zhang, G. Huston, S. Lu, and L. Zhang. IPv4 Address
    Allocation and BGP Routing Table Evolution. In ACM SIGCOMM CCR,
    January 2005.
[21] D. Meyer, L. Zhang, and K. Fall. Report from the IAB Workshop on
    Routing and Addressing. RFC 4984, 2007.
[22] J. Moy. OSPF Version 2. RFC 2328, SRI Network Information Center,
    September 1998.
[23] MRTD: The Multi-Threaded Routing Toolkit. http://www.mrtd.net.
[24] R. Oliveira, R. Izhak-Ratzin, B. Zhang, and L. Zhang. Measurement of
    Highly Active Prefixes in BGP. In IEEE GLOBECOM, 2005.
[25] Y. Rekhter, T. Li, and S. Hares. A Border Gateway Protocol (BGP-4).
    RFC 4271, Jan. 2006.
[26] L. Wang, X. Zhao, D. Pei, R. Bush, D. Massey, A. Mankin, S. F. Wu,
    and L. Zhang. Observation and Analysis of BGP Behavior Under Stress.
    In ACM SIGCOMM IMW, 2002.
[27] R. Whittle. Ivip (Internet Vastly Improved Plumbing) Architecture.
    draft-whittle-ivip-arch-02, August 2008.
