The traditional approach of putting dedicated firewalls within a given physical location in order to provide security services is indeed capable of scaling, but it comes at a cost. Furthermore, within large-scale data center networks, the traditional approach to securing data with firewall clusters often isn't suitable, because the data has grown to proportions that no single firewall cluster is capable of handling.
Day One: Scaling Beyond a Single Juniper SRX in the Data Center elegantly addresses the
problem and provides unique insight into how to provide security to outbound traffic at
levels that can scale to meet the needs of even the largest networks. Follow along with
this proof of concept and get the configuration for doing so at the end.
“Scaling network security infrastructure can be a very challenging endeavor. This book cites potential solutions to these challenges and offers an elegant architecture, one that allows large scale and the ability to add capacity rapidly with minimal effort.”
Daniel Sullivan, Senior Security Engineer, Zynga
IT’S DAY ONE AND YOU HAVE A JOB TO DO, SO LEARN HOW TO:
Understand the concept of scaling traffic beyond a single Juniper SRX firewall.
Articulate the difference between ECMP and Filter-based Forwarding (FBF).
Understand the use cases that drive the requirements for ECMP or FBF.
Perform per-flow load balancing in the master instance while preserving the per-prefix load
balancing within routing instances.
Configure static routes and qualified next-hops that use BFD for liveness detection.
Understand how hash calculations in the Forwarding Information Base (FIB) can impact your
network.
Juniper Networks Books are singularly focused on network productivity and efficiency. Peruse the
complete library at www.juniper.net/books.
© 2012 by Juniper Networks, Inc. All rights reserved.

Juniper Networks, the Juniper Networks logo, Junos, NetScreen, and ScreenOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. JUNOSe is a trademark of Juniper Networks, Inc. All other trademarks, service marks, registered trademarks, or registered service marks are the property of their respective owners.

Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice. Products made or sold by Juniper Networks or components thereof might be covered by one or more of the following patents that are owned by or licensed to Juniper Networks: U.S. Patent Nos. 5,473,599, 5,905,725, 5,909,440, 6,192,051, 6,333,650, 6,359,479, 6,406,312, 6,429,706, 6,459,579, 6,493,347, 6,538,518, 6,538,899, 6,552,918, 6,567,902, 6,578,186, and 6,590,785.

Published by Juniper Networks Books
Printed in the USA by Vervante Corporation.
Author: Douglas Hanks Jr.
Technical Reviewers: Stefan Fouant, Juniper Networks; Dathen Allen, Juniper Networks; Daniel Sullivan, Zynga; Artur Makutunowicz <artur@makutunowicz.net>; Justin Smith, Armature Systems
Editor in Chief: Patrick Ames
Editor and Proofer: Nancy Koerbel
J-Net Community Manager: Julie Wider
ISBN: 978-1-936779-46-8 (print)
ISBN: 978-1-936779-47-5 (ebook)
Version History: v1 April 2012 #7100153-en

About the Author
Douglas Richard Hanks Jr. is a Senior Systems Engineer with Juniper Networks. He is certified in Juniper Networks as JNCIE-ENT #213 and JNCIE-SP #875. Douglas' interests are network engineering and architecture for both Enterprise and Service Provider routing and switching.

Author's Acknowledgments
Thanks, Dad. This book is for you.

This book is available in a variety of formats at: www.juniper.net/dayone.
Send your suggestions, comments, and critiques by email to dayone@juniper.net.
NOTE To perform stateful traffic load generation, this book leveraged IXIA
IxLoad hardware and software; all measurements and reports were
generated using this tool.
or 25% of the total traffic. If one of the SRXs were to fail, only a subset
of the traffic would be impacted and need to be redirected to another
firewall.
Another benefit to using multiple SRXs is that you’re able to tightly
control what traffic flows through which firewall. For example, you
can split all egress HTTP traffic across all four standalone firewalls,
but a special subset of egress traffic can be directed off to a dedicated
SRX cluster for maximum redundancy. In effect, you can create and control Service Level Agreements (SLAs) with a pool of firewalls that have specific functions for performance or redundancy.
Douglas Hanks Jr., April 2012
Foreword
levels that can scale to meet the needs of even the largest organizations.
Two different approaches are outlined, giving the network architect
multiple options for load-balancing traffic while addressing the
concerns regarding physical placement of such devices at the same
time. The approaches outlined in this guide give network architects
new options to address the virtualization of security services, showing
how the strict placement of firewall devices is no longer required to
achieve security at scale.
Furthermore, what makes this Day One book so invaluable is that it is backed up by proven research – not only has Doug covered the theoretical aspects of these different design approaches and provided the required configurations, he backs it all up with testing using traffic generators to gauge latency and other performance metrics under steady-state and failure scenarios.
For those working in large-scale data center networks, this Day One book will prove to be an invaluable asset, covering aspects that have been largely ignored by much of the literature today. It's a must-have for network architects or designers responsible for building out large-scale data center networks. Doug's expertise and his
clear writing elucidate a complex subject and distill it in a way that is
easy to digest and understand.
Stefan Fouant
April 2012, Ashburn, Virginia
Stefan Fouant is a technical trainer at Juniper Networks and has helped hundreds of
engineers earn their certifications. He is JNCIE-SEC, JNCIE-SP, JNCIE-ER, and JNCI.
Chapter 1
The Challenge
The Challenge
Summary
NOTE There are many different types of Stage 1 / DNS solutions on the
market such as managed DNS, geographically-aware DNS, anycast,
and many others. To keep things simple, let’s just use DNS round-robin
in this example.
The Challenge
The more interesting challenge is providing high-scale firewall services
for egress Internet traffic where the destination of the traffic isn’t
known in advance. Methods such as DNS round-robin and traditional load balancing do not apply to this type of traffic pattern, and a different approach is required.
The first problem is that the rate of Traffic0 exceeds the capacity of a
single Juniper SRX. What’s required is a method to take a large stream
of traffic and break it out deterministically into multiple outputs that
are mapped to a particular Juniper SRX. This can be accomplished
with demultiplexing and multiplexing as shown in Figure 1.2.
Figure 1.2 Illustration of a Demultiplexer and Multiplexer
Figure 1.3 Data Center with 20,000 Servers and 100Gbps IMIX Egress Internet Traffic
An Alternative Approach
One alternative approach would be to break the ratio of one firewall
per POD, and associate the number of firewalls to the amount of
aggregate egress traffic. For example, let’s assume that one SRX5800
can provide stateful processing of 47.5Gbps of IMIX traffic. If the
amount of egress Internet traffic is 100Gbps, this would require only
three SRX5800 firewalls, providing a pool of firewall resources
capable of over 140Gbps, as shown in Figure 1.4. In this specific
example, you can reduce the total number of firewalls by 25%.
Figure 1.4 An Alternative Approach to Scale the Firewall Performance to the Total
Bandwidth Aggregate
Summary
When there's a requirement to provide firewall services to traffic at
large scale, you must consider the behavior of the traffic and what's
known in advance. When looking at ingress traffic, the destination
network is always known. Using the destination address, it's possible to
use traditional methods such as DNS round-robin and load balancers
to break the traffic down into manageable streams and apply firewall
services.
Egress Internet traffic has different characteristics. Generally the
destination network isn’t known in advance, because the number of
routable addresses in the Internet is very large. However, what is
known in advance is the source address, because the egress Internet
traffic in a data center is originated locally.
policy, both POD-1 and POD-2 will require two firewalls each. To
make the math easy, let's assume that each SRX5800 costs $100,000.
POD-1 only has 50Gbps of traffic, which works out to $4,000 per
1Gbps to secure. However, POD-2 carries 80Gbps of traffic, which
costs about $2,500 per 1Gbps to secure. Between POD-1 and POD-2
the average cost per 1Gbps is $3,250.
An alternative approach would be to remove the firewalls from the
PODs and instead pool the firewalls together higher in the network
architecture. This creates a common pool of firewall services that can
be consumed by any POD. Using the same numbers from before,
POD-1 and POD-2 carry a combined 130Gbps of egress traffic. This
only requires three SRX5800 firewalls, which works out to about
$2,300 to secure 1Gbps of traffic, compared to the previous $3,250 to
secure traffic on a per-POD basis.
The centralized architecture is more efficient because the load is
distributed uniformly, and as traffic requirements grow, the load
increases evenly across all firewalls in the pool. There are, however,
some drawbacks to a centralized approach, and the details and caveats
will be discussed at length in subsequent chapters.
Let’s get the test bed working, so we can see for ourselves.
Chapter 2
The Test Bed
Physical Topology
Layer 2 Topology
Layer 3 Topology
IS-IS Configuration
BGP Configuration
Traffic Flow
Return Traffic
Summary
This chapter introduces the test bed used to verify the centralized
firewall architecture explained in Chapter 1. Its goals are to verify that
the architecture is functional, determine what conditions trigger a
failure, and observe how traffic is impacted during a failure scenario.
Several components are needed:
Border router
Core switch
Aggregation switch
Firewalls
A stateful testing and load device
NOTE The number of components in the book's topology has been scaled
back since, as noted above, the only goal is to test the functionality and
failure conditions of the centralized firewall architecture.
Physical Topology
The physical topology comprises nine devices: (4) SRX5800s, (2)
EX4500s, (1) EX8208, (1) MX240, and (1) IXIA chassis running
IxLoad for testing. All devices are connected with 2x10GE connections
running IEEE 802.3ad. Figure 2.1 shows the actual physical topology
used for testing in this book.
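The 2x10GE IEEE 802.3ad bundles follow the standard Junos LAG pattern. As a hedged sketch (the member interface names here are placeholders, not taken from the test bed), a bundle is typically built like this:

```
chassis {
    aggregated-devices {
        ethernet {
            device-count 1;    /* number of aeX interfaces to create */
        }
    }
}
interfaces {
    xe-0/0/0 {
        gigether-options {
            802.3ad ae0;       /* add this link to bundle ae0 */
        }
    }
    xe-0/0/1 {
        gigether-options {
            802.3ad ae0;
        }
    }
}
```

The ae0 interface itself then carries the Layer 2 or Layer 3 configuration, as seen in the SRX ae0 example later in this chapter.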
To keep the topology simple, all redundant devices have been removed,
except where absolutely necessary to demonstrate the functionality of
the centralized firewall architecture. There is a single border router, an
aggregation switch, and a core switch. However, the focus of the testing
revolves around the firewall pool, so (4) SRX5800s have been included
in the topology.
Figure 2.1 The Physical Topology
Aggregation
Every device connects into the EX4500-VC, which is acting as the
aggregation switch. This switch isn’t required, but it provides a nice
mechanism to simplify the topology and provide 10GE port density.
NOTE This book does not debate whether Layer 2 or Layer 3 is a better
mechanism for horizontal connectivity, and leaves it as an exercise for
the reader.
Border
The role of the border router is to provide network connectivity to
upstream transit providers; in the physical topology this is the MX240.
It’s connected into the EX4500-VC via 2x10GE ports using IEEE
802.3ad.
There are also 2x10GE ports that are connected to the IXIA; however,
these are configured as Layer 2 access ports and use an irb interface to
peer with the IXIA.
Firewalls
This book uses (4) SRX5800s in the physical topology. No clustering is
used; the firewalls operate in standalone mode. Each firewall is
connected into the aggregation switch via IEEE 802.3ad and IEEE
802.1Q.
Core
The core switch in the topology is represented by the Juniper EX8200.
Its role is to provide both Layer 2 and Layer 3 services. It plays a
critical role in the demultiplexing of egress traffic and the decision
process of how traffic is mapped to firewalls.
Two 10GE ports are connected via IEEE 802.3ad to the aggregation
switch. These ports are configured as a Layer 2 trunk port and peer
with other devices via an irb interface.
Test Device
The testing device is an IXIA chassis running the IxLoad software.
Although the topology shows two separate testing devices, in reality it
was a single chassis with multiple ports connecting into the topology;
two ports on the top acting as an HTTP Server and two ports on the
bottom acting as an HTTP Client.
Layer 2 Topology
Four major VLANs are used in this book’s topology: IXIA, TRUST,
UNTRUST, and DC. Each VLAN represents a logical separation in the
network and partitions each device by function and responsibility. The
TRUST and UNTRUST VLANs are specifically designed to work well
with the Juniper SRX security zone architecture. Figure 2.6 illustrates
the Layer 2 topology.
IXIA VLAN
The IXIA VLAN only exists on two ports on the MX240. This allows
the IXIA device to have two physical ports connected into the MX240
and to use the same network subnet on each physical port.
Let’s take a look at the interface configuration on the MX240 for the
IXIA VLAN:
interfaces {
xe-0/2/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
xe-0/3/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
irb {
unit 100 {
family inet {
address 10.7.7.1/24;
}
}
}
}
NOTE If you’re new to bridging on the Juniper MX, be sure to check out a
forthcoming book to be published by O’Reilly Media in Q3 of 2012:
Juniper MX Series: A Practical Guide to Trio Technologies.
The IXIA VLAN is configured with a vlan-id of 100 and includes the
xe-0/2/0 and xe-0/3/0 interfaces. The routed interface is assigned to
irb.100, which has the IP address 10.7.7.1/24.
TRUST VLAN
The TRUST VLAN is defined on both the EX8200 and EX4500-VC.
Each port on the EX4500-VC that connects to the firewalls and the
EX8200 is configured as either an access or a trunk port. The general
idea is that devices behind the firewalls are trusted and anything
beyond the firewalls towards the MX240 is untrusted.
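As a sketch of how a firewall-facing port on the EX4500-VC might carry both VLANs (the aggregated interface name ae1 is an assumption; VLAN IDs 200 and 300 match the SRX ae0 units shown later in this chapter):

```
interfaces {
    ae1 {
        unit 0 {
            family ethernet-switching {
                port-mode trunk;    /* carry multiple VLANs on one port */
                vlan {
                    members [ TRUST UNTRUST ];
                }
            }
        }
    }
}
vlans {
    TRUST {
        vlan-id 200;
    }
    UNTRUST {
        vlan-id 300;
    }
}
```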
Figure 2.7 Logical Illustration of the Juniper SRX Firewall Sitting Between the UNTRUST
and TRUST VLANs
UNTRUST VLAN
The UNTRUST VLAN is also defined on the EX4500-VC and extends
to the other arm of the firewalls and finally to the MX240. Because the
firewalls are running IEEE 802.1Q, they can have two logical inter-
faces each on the TRUST and UNTRUST VLANs. Figure 2.7 illus-
trates that as data passes through the firewalls, it will flow from
TRUST to UNTRUST.
DC VLAN
The final VLAN is DC and is defined only on the EX8200. It represents
the rest of the data center that would exist in a real production envi-
ronment. The IXIA HTTP Client is connected into the DC VLAN to
source stateful HTTP traffic, which will ultimately flow through the
firewalls and out the MX240 to the IXIA HTTP Server.
Let’s view the VLAN configuration on the EX8200:
vlans {
DC {
vlan-id 1000;
l3-interface vlan.1000;
}
}
interfaces {
vlan {
unit 1000 {
family inet {
address 192.168.1.1/24;
}
family iso;
}
}
}
Layer 3 Topology
There are four major networks defined in this topology that build on
top of the Layer 2 VLAN structure: IXIA Server, Untrust, Trust, and
Data Center.
10.7.7/24
The 10.7.7/24 network sits between the IXIA Server and the MX240.
The MX240 has an IP address of 10.7.7.1/24 while the IXIA Server
uses two IP addresses, 10.7.7.2/24 and 10.7.7.3/24. Let’s take a look at
the interface and bridge domain configuration for the MX240:
interfaces {
xe-0/2/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
xe-0/3/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
irb {
unit 100 {
family inet {
address 10.7.7.1/24;
}
}
}
}
bridge-domains {
IXIA {
vlan-id 100;
interface xe-0/2/0.0;
interface xe-0/3/0.0;
routing-interface irb.100;
}
}
10.3/24
In Junos, 10.3/24 is valid shorthand for the network 10.3.0.0/24. This
network maps directly to the UNTRUST VLAN. All of the firewalls
and the MX240 use this network for reachability. The MX240 is
assigned the IP address 10.3.0.1/24, while the firewalls are assigned
10.3.0.11/24 through 10.3.0.14/24, respectively. Let’s take a look at
the ae0 interface on the first firewall SRX-1:
interfaces {
ae0 {
vlan-tagging;
aggregated-ether-options {
lacp {
periodic fast;
}
}
unit 200 {
vlan-id 200;
family inet {
address 10.2.0.11/24;
}
family iso;
}
unit 300 {
vlan-id 300;
family inet {
address 10.3.0.11/24;
}
family iso;
}
}
}
10.2/24
This network is very similar to 10.3/24, but provides connectivity for
all of the devices in the TRUST VLAN, including the firewalls, since
they're using IEEE 802.1Q and have two logical interfaces. The
firewall IP addresses are 10.2.0.11/24 through 10.2.0.14/24,
respectively. Since the EX8200 is part of the TRUST VLAN, it
participates in this network with the IP address of 10.2.0.10/24.
192.168.1/24
This network represents the data center where the test traffic will be
originated. Only the EX8200 and IXIA Client sit on this network. The
EX8200 has a Layer 3 interface with an IP address of 192.168.1.1/24,
which acts as a default gateway for the IXIA Client.
Loopback Addresses
Each device has its own loopback address that’s used for reachability
and in some cases for IBGP peering, as listed in Table 2.1.
Table 2.1 Loopback Addresses

Device    Loopback Address
MX240     10.3.255.10/32
SRX-1     10.3.255.11/32
SRX-2     10.3.255.12/32
SRX-3     10.3.255.13/32
SRX-4     10.3.255.14/32
EX8200    10.2.255.10/32
The loopback addresses follow the same network address assignments.
For example, the MX240 has a loopback address of
10.3.255.10/32 which has the same first two octets of 10.3 belonging
to devices in the UNTRUST VLAN and the last octet of .10, which
matches the last octet of its 10.3.0.10/24 IP address. This makes the
loopback address easy to remember without having to return to this
chapter as a reference.
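Following this scheme, SRX-1's loopback might look like the sketch below. Only the 10.3.255.11/32 address and the IS-IS area 49.0000 come from the test bed; the full ISO NET is an illustrative assumption:

```
interfaces {
    lo0 {
        unit 0 {
            family inet {
                address 10.3.255.11/32;
            }
            family iso {
                /* NET is hypothetical; only area 49.0000 is from the test bed */
                address 49.0000.0103.2551.1001.00;
            }
        }
    }
}
```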
IS-IS Configuration
This book has elected to use IS-IS as the Interior Gateway Protocol
(IGP). OSPF has been beaten to death and it's always a good idea to
mix it up.
To keep things simple, all devices share the same IS-IS area 49.0000 as
shown in Figure 2.12. To further reduce complexity, all of the interface
adjacencies are Level 2 only. Notice that only devices with a direct
connection form an adjacency with each other. For example, the
MX240 and EX8200 only have an IS-IS adjacency with the firewalls;
the firewalls, however, have an IS-IS adjacency with every other device.
Although there are some devices that do not have a full mesh of IS-IS
adjacencies, each device has a complete route table with connectivity
to all networks and loopback addresses in this network.
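A minimal sketch of the matching IS-IS stanza on SRX-1, consistent with the Level 2-only design and the ae0.200/ae0.300 interfaces seen in the adjacency output that follows (the exact firewall configuration isn't reproduced in this excerpt):

```
protocols {
    isis {
        level 1 disable;       /* Level 2 only adjacencies */
        interface ae0.200;     /* TRUST side */
        interface ae0.300;     /* UNTRUST side */
        interface lo0.0 {
            passive;           /* advertise the loopback without forming adjacencies */
        }
    }
}
```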
jnpr@SRX-1> show isis adjacency
Interface System L State Hold (secs) SNPA
ae0.200 SRX-2 2 Up 20 0:1f:12:f1:ff:c0
ae0.200 SRX-3 2 Up 24 0:1f:12:f6:ef:c0
ae0.200 SRX-4 2 Up 23 0:1f:12:fa:f:c0
ae0.200 EX8208-SW1-RE0 2 Up 6 0:22:83:6a:32:1
ae0.300 MX240-RE0 2 Up 8 0:1f:12:b7:77:c0
ae0.300 SRX-2 2 Up 22 0:1f:12:f1:ff:c0
ae0.300 SRX-3 2 Up 25 0:1f:12:f6:ef:c0
ae0.300 SRX-4 2 Up 19 0:1f:12:fa:f:c0
Let’s verify that IS-IS is properly installing prefixes into the routing
table with the show route protocol isis command:
jnpr@SRX-1> show route protocol isis
inet.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
10.2.255.10/32 *[IS-IS/18] 01:03:33, metric 10
> to 10.2.0.253 via ae0.200
10.3.255.10/32 *[IS-IS/18] 01:03:33, metric 10
> to 10.3.0.1 via ae0.300
10.3.255.12/32 *[IS-IS/18] 01:03:33, metric 10
> to 10.3.0.12 via ae0.300
to 10.2.0.12 via ae0.200
10.3.255.13/32 *[IS-IS/18] 01:03:33, metric 10
to 10.3.0.13 via ae0.300
> to 10.2.0.13 via ae0.200
10.3.255.14/32 *[IS-IS/18] 01:03:33, metric 10
> to 10.3.0.14 via ae0.300
to 10.2.0.14 via ae0.200
192.168.1.0/24 *[IS-IS/18] 01:03:33, metric 20
> to 10.2.0.253 via ae0.200
In this book, each device participating in IS-IS will use BFD for liveness
detection. For sub-second detection, an interval of 300ms will be used
with a multiplier of 3, meaning that if three consecutive BFD packets
are missed, the neighbor is declared down.
Let’s take a look at the BFD configuration of EX8200:
protocols {
isis {
level 1 disable;
interface lo0.0 {
passive;
}
interface vlan.200 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
interface vlan.1000;
}
}
4 sessions, 12 clients
Cumulative transmit rate 13.3 pps, cumulative receive rate 13.3 pps
The EX8200 was able to see all four firewalls on the interface
vlan.200. Multiplying the multiplier (3) by the minimum-interval
(300ms) results in a Detect Time of 0.900 seconds.
BFD becomes especially important when two devices are peering over
a bridge such as the EX8200 and the Juniper SRX firewalls. Recall that
all devices are physically cabled to the EX4500-VC. Imagine if SRX-1
had an interface failure. From the vantage point of the EX8200,
everything would still appear operational, because its interface – which
is connected to the EX4500-VC – is still up. In this example, BFD
would notify IS-IS within 900ms that SRX-1 is down.
BGP Configuration
At this point you should have a very good understanding of how the
topology is connected and how devices are able to communicate. Let’s
move on to the real meat of the topology and take a look at BGP. BGP is
the glue that brings everything together in terms of being able to route
through the topology between the IXIA Client and IXIA Server.
Traffic will be sourced from the IXIA Client, which is sitting on the
network 192.168.1/24, and the packets will be destined to 10.7.7.2 and
10.7.7.3. The only problem is that the next hop router for the IXIA
Client has no idea how to reach 10.7.7.2 or 10.7.7.3. This is where BGP
comes in – a default route will be originated from the MX240 and
propagated throughout the topology.
Figure 2.13 BGP Within the Topology
NOTE To solve the IBGP split horizon problem, EBGP is used between the
EX8200 and the firewalls instead of configuring a BGP route reflector.
Creating a full mesh is another option, but an IBGP connection between
the EX8200 and MX240 would defeat the purpose of the firewalls.
MX240
The advertisement of prefixes throughout the network is very simple. It
all begins at the MX240 with a default route:
routing-options {
static {
route 0.0.0.0/0 discard;
}
autonomous-system 4567;
}
But a static default route that discards all packets isn’t enough. There
needs to be a policy statement that exports this default route to the
firewalls via IBGP:
policy-options {
policy-statement export-default {
term 1 {
from {
protocol static;
route-filter 0.0.0.0/0 exact;
}
then {
accept;
}
}
term 2 {
then reject;
}
}
}
This policy will find the static route 0/0, accept it, and reject all other
prefixes. The next step is to configure BGP on the MX240 and refer-
ence the policy-statement export-default:
protocols {
bgp {
group SRX {
type internal;
local-address 10.3.255.10;
export export-default;
multipath;
neighbor 10.3.255.11;
neighbor 10.3.255.12;
neighbor 10.3.255.13;
neighbor 10.3.255.14;
}
}
}
This BGP group is responsible for peering with the firewalls SRX-1
through SRX-4. Because all of the devices share the same ASN, the
peering type is IBGP. It’s considered best practice to use loopback
addressing when configuring IBGP – the MX240 will use its loopback
address 10.3.255.10 as the local-address and use the loopback address
of each firewall as the neighbor address.
Now that a 0/0 static route exists in the RIB, and there's a policy to find
0/0 and accept the prefix, the policy needs to be applied to the BGP
configuration so the MX240 can advertise the route to the firewalls.
This is done with the export export-default configuration.
Firewalls
The firewalls have the most interesting BGP configuration. Since the
entire point of this case study is to force traffic through the firewalls,
two different BGP groups on the SRX are needed: TRUST and
UNTRUST.
protocols {
bgp {
group UNTRUST {
type internal;
local-address 10.3.255.11;
neighbor 10.3.255.10;
}
group TRUST {
type external;
multihop;
local-address 10.3.255.11;
peer-as 1234;
neighbor 10.2.255.10;
}
}
}
UNTRUST
The BGP group UNTRUST is used to peer with the MX240 via IBGP.
No import or export policy is required. According to best practices, the
loopback addresses are used to establish connectivity between the
firewalls and MX240.
TRUST
When peering with the EX8200, EBGP is used, since the EX8200 is in
a different ASN. For consistency with the MX240 peering, loopback
peering is used again, even though the session is EBGP. When using
loopback peering with EBGP, the multihop option is required because
the default time to live (TTL) for EBGP is 1.
Default Route
Note that neither the BGP group UNTRUST nor TRUST uses an
import or export policy. The firewalls are just using the default BGP
rules
when advertising and accepting prefixes. Since the MX240 is advertis-
ing a 0/0 route to the firewalls, the firewalls advertise the 0/0 route to
the EX8200 in return. Let’s verify this behavior:
jnpr@SRX-1> show bgp summary
Groups: 2 Peers: 2 Down peers: 0
Table Tot Paths Act Paths Suppressed History Damp State Pending
inet.0 1 1 0 0 0 0
Peer AS InPkt OutPkt OutQ Flaps Last Up/Dwn State|#Active/
Received/Accepted/Damped...
10.2.255.10 1234 148 159 0 1 1:03:05 0/0/0/0
0/0/0/0
10.3.255.10 4567 149 156 0 1 1:02:57 1/1/1/0
0/0/0/0
EX8200
The EX8200 peers with all four firewalls via EBGP using loopback
peering. Recall that the EX8200 ASN is 1234 and the ASN of the four
firewalls is 4567. Because of the loopback peering with EBGP, the
multihop option needs to be used to increase the default TTL of 1:
protocols {
bgp {
group SRX {
type external;
multihop;
local-address 10.2.255.10;
peer-as 4567;
multipath;
neighbor 10.3.255.11;
neighbor 10.3.255.12;
neighbor 10.3.255.13;
neighbor 10.3.255.14;
}
}
}
Let’s take a look at the show bgp summary command to verify that
everything has come up properly:
jnpr@EX8208-SW1-RE0> show bgp summary
Groups: 1 Peers: 4 Down peers: 0
Table Tot Paths Act Paths Suppressed History Damp State Pending
inet.0 4 4 0 0 0 0
Peer AS InPkt OutPkt OutQ Flaps Last Up/
Dwn State|#Active/Received/Accepted/Damped...
10.3.255.11 4567 139 137 0 13 1:01:09 Establ
inet.0: 1/1/1/0
10.3.255.12 4567 10216 10187 0 7 3d 4:52:42 Establ
inet.0: 1/1/1/0
10.3.255.13 4567 10221 10186 0 13 3d 4:52:37 Establ
inet.0: 1/1/1/0
10.3.255.14 4567 1386 1379 0 19 10:23:47 Establ
inet.0: 1/1/1/0
inet.0: 16 destinations, 19 routes (16 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
0.0.0.0/0 *[BGP/170] 3d 04:52:49, localpref 100, from 10.3.255.12
AS path: 4567 I
> to 10.2.0.11 via vlan.200
to 10.2.0.12 via vlan.200
to 10.2.0.13 via vlan.200
to 10.2.0.14 via vlan.200
[BGP/170] 01:01:08, localpref 100, from 10.3.255.11
AS path: 4567 I
> to 10.2.0.11 via vlan.200
[BGP/170] 3d 04:52:44, localpref 100, from 10.3.255.13
AS path: 4567 I
> to 10.2.0.13 via vlan.200
[BGP/170] 10:23:54, localpref 100, from 10.3.255.14
AS path: 4567 I
> to 10.2.0.14 via vlan.200
The policy statement lb simply matches all traffic and changes the
load-balance option to be per-packet.
NOTE Keep in mind that when changing the load-balance option to
per-packet, it isn't really per-packet, but per-flow. Junos shows
per-packet simply for historical reasons.
The second part of changing the default load-balancing behavior is to
apply this policy to the FIB. To do so, set the forwarding-table
export to reference the lb policy.
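The lb policy and its application to the forwarding table follow the standard Junos load-balancing pattern; a minimal sketch consistent with the description above:

```
policy-options {
    policy-statement lb {
        then {
            load-balance per-packet;    /* per-flow in practice */
        }
    }
}
routing-options {
    forwarding-table {
        export lb;    /* apply the policy to the FIB */
    }
}
```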
Before committing this configuration, let’s compare the FIB before and
after the change.
BEFORE
jnpr@EX8208-SW1-RE0> show route forwarding-table destination 0/0
Routing table: default.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 ulst 131078 1
indr 131074 2
0:1f:12:f2:7f:c0 ucst 1332 7 vlan.200
indr 131076 2
default perm 0 rjct 36 1
0.0.0.0/32 perm 0 dscd 34 1
AFTER
jnpr@EX8208-SW1-RE0> show route forwarding-table destination 0/0
Routing table: default.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 ulst 131078 1
indr 131074 2
0:1f:12:f2:7f:c0 ucst 1332 7 vlan.200
indr 131076 2
0:1f:12:f1:ff:c0 ucst 1334 7 vlan.200
indr 131075 2
0:1f:12:f6:ef:c0 ucst 1338 7 vlan.200
indr 131070 2
0:1f:12:fa:f:c0 ucst 1324 7 vlan.200
default perm 0 rjct 36 1
0.0.0.0/32 perm 0 dscd 34 1
Note that before the FIB export change, the next hop in the FIB for the
destination address 0/0 was a single MAC address ending in 7f:c0.
After the FIB export change was applied, the next hop for 0/0 has
changed. There are now four next hops pointing to four different
MAC addresses ending in 7f:c0, ff:c0, ef:c0, and 0f:c0.
Now, any traffic that’s taking the 0/0 route in the RIB will be hashed
per-flow in the FIB and have close to uniform distribution across all
four next hops.
Default Route
Because the EX8200 serves as the default gateway for the IXIA Client,
all traffic sourced from the IXIA Client needs to have a valid route on the
EX8200. The IXIA Client has been configured to source traffic from
192.168.1/24 and sends it to 10.7.7.2 and 10.7.7.3. Let’s take a look at
the EX8200, and make sure that there is a valid route:
jnpr@EX8208-SW1-RE0> show route 10.7.7.2
inet.0: 16 destinations, 19 routes (16 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
0.0.0.0/0 *[BGP/170] 3d 04:52:49, localpref 100, from 10.3.255.12
AS path: 4567 I
> to 10.2.0.11 via vlan.200
to 10.2.0.12 via vlan.200
to 10.2.0.13 via vlan.200
to 10.2.0.14 via vlan.200
Perfect; the traffic destined for 10.7.7.2 is hitting the default route on
the EX8200, and will in turn be forwarded uniformly to the SRX
firewalls, and ultimately up to the MX240 to reach the final destination
of the IXIA Server.
Traffic Flow
Up till now, this chapter has covered the physical and logical topology
of the case study. Layered on top of the topology are the routing
protocols IS-IS and BGP to enable reachability within the topology and
allow the IXIA Client to reach the IXIA Server. Let’s take a moment
and review an example traffic flow sourced at the IXIA Client and
destined to the IXIA Server as depicted in Figure 2.14.
Figure 2.14 Traffic Flow from IXIA Client to IXIA Server
The traffic in Figure 2.14 flows from the left to the right. Let’s walk
through the entire process to fully understand how each device routes
the packet:
1. IXIA Client generates a packet with a source of 192.168.1.100 and
a destination of 10.7.7.2.
2. IXIA Client forwards the packet to its default gateway 192.168.1.1.
3. EX8200 receives the packet.
4. EX8200 performs a route lookup for 10.7.7.2 and matches its 0/0
route.
5. EX8200 chooses one of the four next hops for 0/0 and in this
example forwards the packet to 10.3.255.11.
6. SRX-1 receives the packet (this example will skip the security
processing for now).
7. SRX-1 performs a route lookup for 10.7.7.2 and matches its 0/0
route.
8. SRX-1 forwards the packet to 10.3.255.10.
9. MX240 receives the packet.
10. MX240 performs a route lookup for 10.7.7.2.
11. MX240 has a direct interface with 10.7.7.1/24 and forwards it out
of this interface.
12. IXIA Server receives the packet.
Of course, this is only half of the picture. What’s shown in Figure 2.14
is only the egress traffic destined to the IXIA Server. What’s missing is
the return traffic that’s destined to the IXIA Client.
Return Traffic
The previous sections in this chapter provided a glimpse into how the
IXIA Client is able to send traffic out to the IXIA Server. Let’s go back
to the root of the challenge, and focus on how to demultiplex a large
stream of traffic statefully and distribute flows uniformly to multiple
SRX Series firewalls.
Figure 2.15 Demultiplexing and Multiplexing Traffic
source-address 192.168.0.0/16;
}
then {
source-nat {
pool {
test-snat;
}
}
}
}
}
}
}
policies {
from-zone TRUST to-zone UNTRUST {
policy permit-all {
match {
source-address any;
destination-address any;
application any;
}
then {
permit;
count;
}
}
}
}
}
When applying SNAT to traffic, there are several items that need to be
configured. To permit the traffic, a policy needs to be created to allow
traffic from the TRUST zone to the UNTRUST zone. The next step is
to define the SNAT pool. In this example SRX-1 creates a pool called
test-snat, which contains the address range 20.20.31/24. The same is
true for the other firewalls: SRX-2 through SRX-4 use the pools
20.20.32/24 through 20.20.34/24, respectively. The last step is to
create a NAT rule to match traffic from the TRUST zone going to the
UNTRUST zone with a source address matching 192.168/16.
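Pulling those pieces together, the SNAT side of the configuration can be sketched as follows (the rule-set and rule names are placeholders, not taken from the book's full listing; the pool, zones, and match prefix come from the text):

```
security {
    nat {
        source {
            pool test-snat {
                address {
                    20.20.31.0/24;
                }
            }
            rule-set trust-to-untrust {
                from zone TRUST;
                to zone UNTRUST;
                rule snat-192-168 {
                    match {
                        source-address 192.168.0.0/16;
                    }
                    then {
                        source-nat {
                            pool {
                                test-snat;
                            }
                        }
                    }
                }
            }
        }
    }
}
```

On SRX-2 through SRX-4 only the pool address changes, to 20.20.32.0/24 through 20.20.34.0/24.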
The last piece of routing information needed is some static routes on
the MX240. Since SRX-1 through SRX-4 will be using a SNAT pool of
20.20.31/24 through 20.20.34/24, respectively, the MX240 needs to
know how to reach these prefixes:
routing-options {
static {
route 20.20.31.0/24 next-hop 10.3.0.11;
route 20.20.32.0/24 next-hop 10.3.0.12;
route 20.20.33.0/24 next-hop 10.3.0.13;
route 20.20.34.0/24 next-hop 10.3.0.14;
}
}
Armed with this new information, let’s walk through the entire flow of
egress and ingress traffic from IXIA Client to IXIA Server as shown in
Figure 2.16.
Figure 2.16 Egress and Ingress Traffic
13. IXIA Server processes the packet and responds. IXIA Server
performs a route lookup for 20.20.31.1 and sends the packet out its
default gateway of 10.7.7.1.
14. IXIA Server forwards the packet to the MX240.
15. MX240 receives the packet.
16. MX240 performs a route lookup for 20.20.31.1 and finds a static
route for 20.20.31/24 pointing to 10.3.0.11.
17. MX240 forwards the packet to 10.3.0.11.
18. SRX-1 receives the packet.
19. SRX-1 identifies the packet as part of an existing session as
explained in step 7. SRX-1 also identifies the packet as part of a SNAT
rule. SRX-1 reverts the destination address to the original source
address of 192.168.1.100 as described in step 7. SRX-1 performs a
route lookup for 192.168.1.100 and sees an IS-IS route for
192.168.1/24 pointing to the EX8200.
20. SRX-1 forwards the packet to 10.3.255.10.
21. EX8200 receives the packet.
22. EX8200 performs a route lookup for 192.168.1.100 and finds a
directly connected network of 192.168.1/24.
23. EX8200 forwards the packet to IXIA Client.
24. IXIA Client receives the packet.
This may seem like a lot of steps, but this example has been exaggerated
to illustrate each function on each device through the packet's entire life.
Using a combination of ECMP for egress demux and SNAT to ensure
an invertible demux on the return traffic is the key to this architecture.
It’s all about breaking down a complex problem into simple building
blocks.
NOTE The author realizes that the destination address 10.7.7/24 isn't
routable on the Internet and that an address range outside of RFC
1918 would have been a better choice. Please accept my apologies.
Pedantry aside, using a private address doesn't impact the functionality
of the case study.
Summary
From cabling and connecting the devices to setting up VLANs and
configuring routing protocols, this chapter has explained in detail how
the test bed is configured. Sometimes it’s easy to get so caught up in the
details you can’t see the forest for the trees. Let’s take a step back from
the implementation details and review the goals we set at the beginning
of the chapter.
When the amount of traffic exceeds the capacity of a single firewall,
the traffic must be broken down into smaller chunks that can each be
serviced by a single firewall. The traffic is then sent to its final destination.
The next step is to handle the return traffic in the same fashion as
the original egress traffic. Because the egress traffic is subject to a
unique SNAT, the return traffic is always guaranteed to go back to the
firewall from which it came, as shown in Figure 2.17.
There are two flows in Figure 2.17: flow0 and flow1. In this example,
flow0 is mapped to SRX-1 via the FIB load balancing by the EX8200.
Because the SRX-1 has a unique SNAT pool, the return traffic is
guaranteed to be routed back to SRX-1. The same is true for flow1,
except that in this example, the EX8200 FIB has load balanced it to
SRX-4 – thus it’s subject to the unique SNAT pool on SRX-4. The
return traffic for flow1 is then destined back to SRX-4, making the
entire conversation stateful.
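That guarantee can be modeled in a few lines (a toy sketch: SHA-256 stands in for the EX8200's ECMP hash, while the pools and return routes mirror the chapter's configuration):

```python
import hashlib

# Each firewall owns a unique SNAT /24 (from the chapter's config).
snat_prefix = {
    "SRX-1": "20.20.31",
    "SRX-2": "20.20.32",
    "SRX-3": "20.20.33",
    "SRX-4": "20.20.34",
}
# The MX240's static routes invert that mapping on the return path.
return_route = {prefix: fw for fw, prefix in snat_prefix.items()}
firewalls = sorted(snat_prefix)

def egress_firewall(flow):
    # EX8200: per-flow ECMP hash selects one of the four firewalls.
    digest = int(hashlib.sha256(repr(flow).encode()).hexdigest(), 16)
    return firewalls[digest % len(firewalls)]

def return_firewall(flow):
    # Server replies to the SNAT address; MX240 routes on its /24.
    snat_src = snat_prefix[egress_firewall(flow)] + ".100"
    return return_route[snat_src.rsplit(".", 1)[0]]

# Every flow's return traffic reaches the firewall holding its state.
for port in range(40000, 40100):
    flow = ("192.168.1.100", port, "10.7.7.2", 80)
    assert return_firewall(flow) == egress_firewall(flow)
print("return path always matches the egress firewall")
```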
ECMP and SNAT are a powerful combination: such simple tools can
be used to solve complex problems. However, ECMP isn't as simple as
it appears. The next chapter takes a deep dive into ECMP and focuses
on failure conditions.
Chapter 3
Equal Cost Multi-Path (ECMP) Routing
ECMP Drawbacks
ECMP Testing
ECMP Conclusions
There are two components in ECMP: the RIB and FIB. Both the RIB
and FIB work in various combinations to provide different types of
behavior in ECMP. As a quick refresher, let’s review the difference
between the RIB and FIB.
inet.0: 16 destinations, 19 routes (16 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
0.0.0.0/0 *[BGP/170] 3d 04:52:49, localpref 100, from 10.3.255.12
AS path: 14824 I
> to 10.2.0.11 via vlan.200
to 10.2.0.12 via vlan.200
to 10.2.0.13 via vlan.200
to 10.2.0.14 via vlan.200
[BGP/170] 01:01:08, localpref 100, from 10.3.255.11
AS path: 14824 I
> to 10.2.0.11 via vlan.200
[BGP/170] 3d 04:52:44, localpref 100, from 10.3.255.13
AS path: 14824 I
> to 10.2.0.13 via vlan.200
[BGP/170] 10:23:54, localpref 100, from 10.3.255.14
AS path: 14824 I
> to 10.2.0.14 via vlan.200
[Static/200] 3d 04:43:27
> to 10.2.0.11 via vlan.200
The prefix 0/0 has five entries in the RIB: there’s an entry for each of
the EBGP neighbors, 10.3.255.11 through 10.3.255.14. The last entry
for 0/0 is a static route with a preference of 200.
The job of the RIB is to determine which of the available destinations is
considered the best. The RIB has to follow a set of rules, such as:
Match the longest prefix.
Prefer the lowest preference.
If an IGP, prefer the lowest metric.
If BGP, there’s an entire laundry list of over items that need to be
compared to find the best path.
Once the RIB has identified the best destination for a given prefix, it is
pushed down to the FIB. Thus, the FIB only contains the bare essentials
that are required to forward packets. Let's review the FIB for the prefix
0/0 and compare it to the RIB:
jnpr@EX8208-SW1-RE0> show route forwarding-table destination 0/0
Routing table: default.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 ulst 131078 1
indr 131074 2
0:1f:12:f2:7f:c0 ucst 1332 7 vlan.200
indr 131076 2
0:1f:12:f1:ff:c0 ucst 1334 7 vlan.200
indr 131075 2
0:1f:12:f6:ef:c0 ucst 1338 7 vlan.200
indr 131070 2
0:1f:12:fa:f:c0 ucst 1324 7 vlan.200
default perm 0 rjct 36 1
0.0.0.0/32 perm 0 dscd 34 1
In this example, the FIB for the prefix 0/0 only has a single entry, the
one considered the best destination by the RIB, but it has multiple next
hops. This information is installed into each of the ASICs on the router
or switch's line cards and is used exclusively for forwarding traffic at
line-rate. Think of the RIB as the brains, and the FIB as the brawn.
Load Balancing
Junos supports two different types of load balancing in the FIB:
per-prefix and per-flow. By default, Junos uses per-prefix load balancing.
If a given set of prefixes shares the same set of next hops, each
prefix will increment the next hop until the last next hop in the set is
reached; then it starts from the beginning again, as shown in Figure 3.1.
In Figure 3.1, given that Prefix 1 through Prefix 4 have the destinations
of next-hop 1 and next-hop 2, each prefix increments the next hop in
round-robin order: Prefix 1 maps to next-hop 1, Prefix 2 to next-hop 2,
Prefix 3 back to next-hop 1, and Prefix 4 to next-hop 2.
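The round-robin assignment described above can be sketched in a few lines (a toy model; the prefix and next-hop names are illustrative):

```python
# Per-prefix load balancing: successive prefixes take successive next
# hops in round-robin order, so each prefix is pinned to one next hop.
def per_prefix(prefixes, next_hops):
    return {p: next_hops[i % len(next_hops)] for i, p in enumerate(prefixes)}

mapping = per_prefix(
    ["Prefix 1", "Prefix 2", "Prefix 3", "Prefix 4"],
    ["next-hop 1", "next-hop 2"],
)
print(mapping["Prefix 1"], mapping["Prefix 3"])  # both pinned to next-hop 1
```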
DID YOU KNOW? One of the most asked questions about Junos is: Why does Junos have
an option called per-packet if it's really per-flow? The answer is that
on Juniper's first router – the M40 – the per-packet option actually load
balanced the traffic across multiple next hops on a per-packet basis.
Obviously, this was a problem, as packets tended to arrive out of order
and cause oscillation in the network. It was decided to change this
behavior to be per-flow instead of per-packet. The unfortunate (and
fortunate) part is that the configuration option was left unchanged so
that it wouldn’t break previous customer configurations. Juniper has
had bigger fish to fry since then, but hopefully one day this per-packet
option can be deprecated and replaced with the correctly named
per-flow option instead.
ECMP Drawbacks
ECMP is great at providing uniform traffic distribution across a set of
next hops and has been used with great success in switches and routers
for a very long time. The hash algorithm used by the FIB to determine a
next hop is deterministic per packet, so it’s very good at providing
per-flow distribution. However, the Achilles' heel of ECMP is that
when the attributes that feed the hash function change, the output of
the algorithm changes as well: next hops are computed differently, and
a given flow may be mapped to a different next hop than before.
Consider Figure 3.3.
In Figure 3.3 there are four packets that represent four different flows
as well as four next hops. Let’s assume that each packet is mapped to a
specific next hop, given the current state of the hash function.
Let’s also assume that each next hop represents a different Juniper SRX
firewall. Now imagine that something occurred to cause the link to go
down; this could be the result of a maintenance window or a failure.
Since the link is down, it's no longer a valid next hop with regard to the
hash function, as shown in Figure 3.4.
Now that next-hop 3 is no longer available, the attributes that feed the
hash algorithm are different, and the next hop calculation has changed.
Assume that the same packets arrive at the hash algorithm again.
Previously, when there were four available next hops, packet 1 was
mapped to next-hop 2, but now that the hash is being calculated
differently due to the change in next hops, the new next hop for packet
1 is next-hop 4.
This may appear to make sense visually, but how does this work
mathematically? Let’s take a look at a very simple hash algorithm:
hash = mod(p, n)
Here p is the packet number, and n is the number of next hops.
To illustrate this formula, let’s graph ten packets using the same
formula, but three different numbers of next hops:
In Figure 3.5 each shape represents the output of the hash function
given a different bucket size. In this example, the bucket size is analo-
gous to the number of next hops. The X axis represents the packet
number and the Y axis represents the bucket number. For example,
using the function mod(5, 3), packet 5 would be placed into bucket 2,
however, when using the function mod(5, 7), packet 5 would be placed
into bucket 5. The output is listed in Table 3.1.
Table 3.1 Output of hash = mod(p, n) for n = 3, 4, and 7
packet 1 2 3 4 5 6 7 8 9 10
mod(3) 1 2 0 1 2 0 1 2 0 1
mod(4) 1 2 3 0 1 2 3 0 1 2
mod(7) 1 2 3 4 5 6 0 1 2 3
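Table 3.1 can be reproduced, and the remapping effect quantified, with a few lines of Python:

```python
# The toy hash from the text: packet p goes to bucket mod(p, n),
# where n is the number of available next hops.
def bucket(p, n):
    return p % n

packets = list(range(1, 11))
table = {n: [bucket(p, n) for p in packets] for n in (3, 4, 7)}
print(table[4])  # [1, 2, 3, 0, 1, 2, 3, 0, 1, 2], matching Table 3.1

# When the number of next hops drops from four to three, most
# packets land in a different bucket than before:
moved = sum(1 for p in packets if bucket(p, 4) != bucket(p, 3))
print(moved, "of", len(packets), "packets remapped")  # 8 of 10
```

Eight of the ten packets are remapped, exactly the kind of reshuffling that strands established sessions on the wrong firewall.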
The end result is that any topology change in the network can impact
how traffic flows through the FIB. Since firewalls are stateful devices
that keep track of traffic flows, imagine four firewalls, each with its
own session table, when one firewall fails and changes the number of
next hops to three. Since the hash will be calculated differently, it's
possible that flows previously going to SRX-1 are now mapped to
SRX-3. What happens in this case is that SRX-3 receives the flow
midstream, looks in its session table, sees that there's no existing
session for the incoming packet, and discards the packet.
NOTE The mod function was used here to provide a simple illustration of
hashing and how the number of next hops can change the output of the
function. Please rest assured that Juniper ASICs use a much more
advanced hashing algorithm. ;)
If:
a firewall is rebooted
a link is disabled for maintenance
... or any other action is undertaken that causes the number of firewall
next hops on the EX8200 to change, it's extremely probable that a
subset of flows will be mapped to a new next hop. The end result is that
a subset of the traffic is discarded by the new firewall, as it will not
accept traffic without first having a session in its session table. ECMP
is therefore only suitable for traffic that is short-lived and able to
recover from errors by initializing new connections.
ECMP Testing
Armed with the knowledge of how the FIB hashing algorithm can
impact traffic, the next logical step is to test how a topology change
impacts real traffic. The goal here is to generate stateful traffic sourced
from an IXIA Client destined to the IXIA Server and measure the
impact of a FIB change.
It’s expected that during a topology change any existing concurrent
connections would be dropped. This is because existing flows would be
remapped to different next hops and intercepted by different firewalls
that have no knowledge of the ingress packet.
Key Value
Protocol TCP
Application HTTP
CPS 50,000
Test Objective
When the test was started, the CPS ramped up to 50,000 within about
10 seconds. Each Juniper SRX firewall received about 12,500 CPS
as the traffic was distributed uniformly across all four next hops.
The EX8200 was configured with ECMP and per-packet (read:
per-flow) FIB load balancing.
Around 160 seconds into the test the author disabled the interface ae0
on SRX-1, causing the FIB on the EX8200 to go from four next hops
down to three. The CPS instantly dropped down to around 38,000 and
quickly ramped back up to 50,000 within 10 seconds, as shown in
Figure 3.6.
What was the real impact, though? Let’s take a look at the active
number of HTTP sessions shown in Figure 3.7.
Figure 3.7 HTTP Transactions That are Active for ECMP Test
Because the four Juniper SRX firewalls have been configured to discard
traffic that isn’t already in the session table, what we can expect to see
is a flurry of TCP Retries from the IXIA Client during the EX8200 FIB
change.
As predicted, there are a large number of TCP Retries on the IXIA
Client. Because the IXIA Client wasn’t receiving a TCP ACK from the
IXIA Server, it kept attempting to retry until the socket was timed out.
The real impact during the exact moment of the EX8200 FIB change
was 19 sockets. This can be verified by looking at the number of IXIA
Client TCP RST packets sent. The IXIA Client never received TCP
ACKs from the IXIA Server, so it eventually timed out, and instead of
closing the socket with a TCP FIN, it sent a TCP RST packet. Notice
that the graph shows about 12 active HTTP connections at the time of
failure, but the report indicated that 19 active HTTP connections were
dropped. The graph isn't 100% accurate because it averages the data,
but what can be concluded is that there were at least 12 active HTTP
sessions across all four firewalls according to the graph, and the
detailed report confirms that exactly 19 were impacted because of the
failure. This clearly shows that all firewalls are impacted equally
during a failure scenario using ECMP, because there's only a single
failure domain in this architecture.
Figure 3.9 TCP Connection Totals for Client and Server – ECMP Testing
Over the five minute duration of the IXIA test, nearly 15,000,000
packets were sent and received, as shown in Figure 3.9. Drilling down
into the details of the number of TCP SYN packets sent and received,
we can see there is a delta of 274 TCP SYN packets missing. Given that
the IXIA Client was sending out TCP SYN packets at a rate of 50,000
per second, it's calculated that during the EX8200 FIB change roughly
5ms of traffic was dropped on the disabled interface going to SRX-1.
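The 5ms figure is simple arithmetic on the two numbers above:

```python
# 274 TCP SYNs went missing at a setup rate of 50,000 connections/s.
missing_syns = 274
cps = 50_000
loss_ms = missing_syns / cps * 1000
print(f"{loss_ms:.2f} ms")  # 5.48 ms, i.e. roughly 5ms of traffic lost
```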
NOTE The EX8200 FIB change caused 5ms worth of traffic to be dropped in
this specific topology and test configuration. This number is specific to
this case study and will differ in your network.
ECMP Conclusions
Using ECMP as a demux is an efficient method to split traffic into
chunks based off TCP and UDP flows. The hashing algorithm allows
the FIB to forward traffic to multiple next hops at line-rate without the
overhead of a state table. The caveat is that when the number of FIB
next hops changes for a given prefix, so does the algorithm to calculate
the next hop.
During an EX8200 FIB change the real impact on stateful traffic – according
to the data above – is as follows:
As the next hop for SRX-1 was being removed from the EX8200
FIB, there was 5ms of traffic loss for this particular next hop.
Nineteen TCP sockets were timed out and closed.
Based off the test results and behavior during a failure scenario, it’s
recommended that ECMP be used with short lived sessions that are
able to reestablish a new session in the event of a failure.
Chapter 4
Filter-Based Forwarding
A Different Approach
FBF Configuration
FBF Testing
If the effect of a FIB change using ECMP causes too much of an impact
in your network, the alternative is to reduce the size of the failure
domain; this can be accomplished with Filter-Based Forwarding (FBF).
FBF is a method of matching traffic with firewall filter rules and
pushing it into a different routing instance. This is similar to
policy-based routing (PBR), but FBF offers a huge advantage because
it's able to change the entire routing instance and not just the next hop.
A Different Approach
Traditional PBR will match traffic and simply change the next hop.
This works well enough until said next hop no longer exists due to a
failure in the network. PBR is a very rigid method of moving a subset of
traffic to a specific next hop. It provides nothing more and nothing less,
as you see in Figure 4.1.
Figure 4.1 Policy-Based Routing
PBR is too rigid to handle failure scenarios. Figure 4.1 illustrates that a
prefix is simply mapped to a next hop, and if that next hop were to
become unreachable, the PBR for Packet 1 would simply discard the
traffic.
FBF has the advantage of moving the traffic into a completely different
routing instance. Imagine that the routing instance SRX.inet.0, shown
in Figure 4.2, is running a dynamic routing protocol or has multiple
static default routes and is able to recover from a simple next hop
failure, whereas traditional PBR could not.
BFD could detect the loss of forwarding, and any traffic entering the
routing instance SRX.inet.0 would be forwarded to next-hop 2
instead.
In Figure 4.3, the first term would match 192.168.1/26 and move
the traffic into the SRX-1.inet.0 routing instance, and so on, all the
way through 192.168.1.192/26, which would move the traffic into the
SRX-4.inet.0 routing instance.
This method increases the number of failure domains from one to four,
which contains a failure to where it happened. In Figure 4.3 each SRX
represents a single failure domain, for a total of four. The more failure
domains, the better, as other parts of the network can operate without
being impacted.
For example, if SRX-1 had a failure, the traffic being matched in Term
1 would continue to be mapped to the SRX-1.inet.0 routing instance,
but a floating default route would intercept the traffic and point it to
SRX-2. During such a failure scenario, traffic being matched by Term
2, Term 3, and Term 4 would be unaffected by the failure of SRX-1.
As shown in Figure 4.4, the decision point revolves around stability
versus performance. The more next hops in ECMP, the better the
performance and uniform distribution; the fewer next hops in ECMP,
the better the stability during a failure scenario.
Because each routing instance has its own instance of a FIB, it’s
possible to implement an architecture where both performance and
stability can co-exist. In Figure 4.5, both Term 1 and Term 2 use
ECMP to both SRX-1 and SRX-2. This offers more uniform distribution
and performance, but the caveat is that they share the same failure
domain.
FBF Configuration
The configuration of FBF is straightforward. The only difficulty is
deciding how to break up the traffic into different failure domains and
how to respond during a failure.
In the test bed, the IXIA Client is configured to use source addresses of
192.168.1.2 through 192.168.1.252. The most logical way to segment
this traffic is by matching on the four /26s within 192.168.1/24. Let’s
take a look at such a firewall filter:
firewall {
family inet {
filter distribute-default {
term SRX-1 {
from {
source-address {
192.168.1.0/26;
}
}
then {
routing-instance SRX-1;
}
}
term SRX-2 {
from {
source-address {
192.168.1.64/26;
}
}
then {
routing-instance SRX-2;
}
}
term SRX-3 {
from {
source-address {
192.168.1.128/26;
}
}
then {
routing-instance SRX-3;
}
}
term SRX-4 {
from {
source-address {
192.168.1.192/26;
}
}
then {
routing-instance SRX-4;
}
}
}
}
}
This firewall filter effectively breaks out each /26 network into its own
routing instance. Breaking out the traffic into different routing instanc-
es is easy enough, but how do you create a routing instance and define
the next hops?
The first step is to create the routing instances for each Juniper SRX in
the topology: SRX-1, SRX-2, SRX-3, and SRX-4.
routing-instances {
SRX-1 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.11;
qualified-next-hop 10.2.0.12 {
metric 6;
}
}
}
}
}
}
This routing instance isn’t enough, however, because the static route
0/0 has two next hops of 10.2.0.11 and 10.2.0.12, which are reachable
via vlan.200. The problem is that vlan.200 isn’t reachable inside of the
routing instance SRX-1.inet.0. In order for vlan.200 to be reachable
from the master routing instance inet.0 and SRX-1.inet.0, a RIB group
needs to be created:
routing-options {
interface-routes {
rib-group inet SRX;
}
rib-groups {
SRX {
import-rib [ inet.0 SRX-1.inet.0 SRX-2.inet.0 SRX-3.inet.0 SRX-4.inet.0 ];
}
}
}
After committing this configuration, the routing instance shows both default routes:
SRX-1.inet.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
0.0.0.0/0 *[Static/5] 06:41:39
> to 10.2.0.11 via vlan.200
[Static/6] 06:41:39
to 10.2.0.12 via vlan.200
The next step is to ensure that failover is sub-second. Recall that the
test bed topology has a simple Layer 2 switch between the EX8200 and
the Juniper SRX firewalls. If there were a link failure on SRX-1, it
wouldn't be seen by the EX8200, so another method is required to
detect a data plane failure. Chapter 2 introduced BFD as a method to
detect data plane failures on top of the IS-IS routing protocol. Since
BFD is agnostic to its client, Junos supports using BFD with static routes.
Within each routing instance, BFD will be configured for each next hop
to provide sub-second failover:
routing-instances {
SRX-1 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.11 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
qualified-next-hop 10.2.0.12 {
metric 6;
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
}
}
}
}
With both static routes riding on top of BFD, let's verify with show bfd
session extensive that BFD is up and lists the static route as a client:
jnpr@EX8208-SW1-RE0> show bfd session extensive
Detect Transmit
Address State Interface Time Interval Multiplier
10.2.0.11 Up vlan.200 0.900 0.300 3
Client Static, TX interval 0.300, RX interval 0.300
Client ISIS L2, TX interval 0.300, RX interval 0.300
Session up time 01:01:54, previous down time 05:23:44
Local diagnostic NbrSignal, remote diagnostic None
Remote state Up, version 1, Replicated
Min async interval 0.300, min slow interval 1.000
Adaptive async TX interval 0.300, RX interval 0.300
Local min TX interval 0.300, minimum RX interval 0.300, multiplier 3
Remote min TX interval 0.300, min RX interval 0.300, multiplier 3
Local discriminator 4, remote discriminator 9
Echo mode disabled/inactive
You can see that once the packet enters the new routing instance it’s
under the control of the RIB and FIB of that particular routing instance.
Each routing instance is mapped to two different firewalls. For example,
the routing instance SRX-1.inet.0 is mapped to both firewalls SRX-1
and SRX-2. In addition to being mapped to both firewalls, each routing
instance provides two default routes with different preferences. Each
routing instance has a preferred firewall as the default route for all
traffic. For example, all traffic entering routing instance SRX-1.inet.0
would use the next hop for SRX-1. In the event of a failure that caused
SRX-1 to become unreachable, BFD would detect the forwarding error
and signal the static route for SRX-1 to be removed, thus only leaving
the backup default route for SRX-2 available.
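The primary/backup selection can be sketched as a toy model of the RIB's behavior (the next hops and preferences mirror the routing-instance configuration above):

```python
# Active route = the lowest-preference static default whose BFD
# session is still up; BFD going down simply removes that candidate.
routes = [
    {"next_hop": "10.2.0.11", "preference": 5, "bfd_up": True},  # SRX-1
    {"next_hop": "10.2.0.12", "preference": 6, "bfd_up": True},  # SRX-2
]

def active_route(routes):
    up = [r for r in routes if r["bfd_up"]]
    return min(up, key=lambda r: r["preference"]) if up else None

print(active_route(routes)["next_hop"])  # 10.2.0.11 (primary, pref 5)
routes[0]["bfd_up"] = False              # BFD declares SRX-1 down
print(active_route(routes)["next_hop"])  # 10.2.0.12 (backup, pref 6)
```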
Let’s take a closer look at the life of a packet when flowing through this
new architecture, as shown in Figure 4.7.
Figure 4.7 Flow Chart of a Packet using FBF and Primary / Secondary Default Routes
Figure 4.7 begins with the packet on the top left. It will be subject to a
firewall filter with four terms. Each term looks at the source address to
see if it matches a specific /26; if there’s no match it simply discards the
packet. Let’s assume that the source address of the packet is
192.168.1.67 with a destination address of 10.7.7.2. The second term
in the firewall filter matches this packet and pushes it into the
SRX-2.inet.0 routing instance. Once the packet is inside of the routing
instance it will need to choose a route. Inside of each routing instance
are two default routes. The first default route has a default preference
of 5, while the second default route has a preference of 6. Using this
method always guarantees that traffic will prefer the first default route
as it has a lower preference. Also keep in mind that each default route
is a client to BFD and is monitoring each of the next hops. Let’s assume
there was a problem with SRX-1. BFD would detect the loss of hellos
and declare that the next hop to SRX-1 is down. The first default route
would be removed from the SRX-1.inet.0 route table and the only
remaining default route left would be pointing to SRX-2. Since the first
default route has been removed, the packet takes the second default
route with a preference of 6 and is forwarded to SRX-2.
The firewall filter, routing instances, and default routes have been
adjusted so that the traffic is split evenly across the 192.168.1/24
network into four different networks on the /26 boundary. Each /26
network has its own routing instance, default route, and firewall. This
configuration is a perfect example of tipping the scale towards stability,
as ECMP is not used.
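One piece the chapter's snippets don't show is binding the filter to the client-facing interface. A minimal sketch, in which the interface and unit names are assumptions rather than values from the text:

```
interfaces {
    vlan {
        unit 100 {
            family inet {
                filter {
                    input distribute-default;
                }
            }
        }
    }
}
```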
FBF Testing
FBF introduces new options in how traffic is mapped to multiple
firewalls. Although FBF can be used to create a hybrid model of
performance and stability, it makes more sense to test the latter in this
chapter.
During a topology change it would be expected that only concurrent
flows in the specific routing instance would be impacted, while other
traffic in unrelated routing instances would continue forwarding traffic
without impact.
Key Value
Protocol TCP
Application HTTP
CPS 50,000
Test Objective
When the test was started, the CPS ramped up to 50,000 within about
10 seconds. Each Juniper SRX firewall received about 12,500 CPS
as the traffic was distributed across all four next hops based off each
packet's source address. The EX8200 was configured with FBF and
per-prefix FIB load balancing, which results in only a single next hop
in the FIB at any given time.
Once again, the author shut down interface ae0 on SRX-1 at around
160 seconds. The CPS dropped to about 42,000 for about five seconds
and quickly recovered, as shown in Figure 4.8.
During this failure scenario, it was noted that BFD detected the change
and removed the primary default route from the SRX-1.inet.0 RIB, and
traffic with source addresses in 192.168.1.0/26 and 192.168.1.64/26
was mapped to the firewall SRX-2. This meant that SRX-2 was doing
double duty for the remainder of the test, since interface ae0 on SRX-1
remained disabled.
Let's review the cumulative report for the FBF test in Figure 4.9 to
determine the real impact on the network.
The concurrent HTTP connections traversing SRX-1 timed out during
the failure scenario. There were 135 IXIA Client TCP retries, because
the traffic for these particular sockets was remapped to SRX-2. SRX-2
had no knowledge of the sessions active on SRX-1 and simply discarded
the packets. Because the IXIA Client wasn't receiving TCP ACKs, it
retried until the connection timed out. Once the connection timed out,
the IXIA Client sent a TCP RST packet to the IXIA Server to kill the
connection.
Chapter 4: Filter-Based Forwarding 77
The benefit of using FBF was that the SRX-1 failure was localized to
the SRX-1.inet.0 routing instance. Traffic flowing through the other
three routing instances wasn't impacted and continued to be forwarded
normally.
Proof of Concept
Junos FIB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Traffic Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Firewall Clustering. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Junos FIB
Routing table: __master.anon__.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default perm 0 rjct 1285 1
0.0.0.0/32 perm 0 dscd 1283 1
Routing table: SRX-1.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 0:1f:12:f2:7f:c0 ucst 1332 8 vlan.200
default perm 0 rjct 1372 1
0.0.0.0/32 perm 0 dscd 1370 1
Routing table: SRX-2.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 0:1f:12:f1:ff:c0 ucst 1334 8 vlan.200
default perm 0 rjct 1381 1
0.0.0.0/32 perm 0 dscd 1379 1
Routing table: SRX-3.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 0:1f:12:f6:ef:c0 ucst 1338 7 vlan.200
default perm 0 rjct 1390 1
0.0.0.0/32 perm 0 dscd 1388 1
Routing table: SRX-4.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
default user 0 0:1f:12:fa:f:c0 ucst 1324 7 vlan.200
default perm 0 rjct 1399 1
0.0.0.0/32 perm 0 dscd 1397 1
The key here is to match on the routing instance and apply a separate
action for each instance. In this example, the routing instances SRX-1
through SRX-4 have an action of then accept. This action simply
applies the default FIB policy, which is per-prefix, forcing the FIB to
install a single next hop. The last term in the policy-statement matches
the master routing instance with from instance master; its action,
load-balance per-packet, forces the FIB to perform per-flow hashing,
which allows multiple next hops.
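For reference, here is the complete forwarding-table export policy just described, reproduced from the EX8200-FBF configuration in the Appendix:

```
policy-options {
    policy-statement lb {
        term SRX-1 {
            from instance SRX-1;
            then accept;
        }
        term SRX-2 {
            from instance SRX-2;
            then accept;
        }
        term SRX-3 {
            from instance SRX-3;
            then accept;
        }
        term SRX-4 {
            from instance SRX-4;
            then accept;
        }
        term master {
            from instance master;
            then {
                load-balance per-packet;
            }
        }
    }
}
```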
Traffic Distribution
One of the key components to using the FBF architecture is the
creation and maintenance of a firewall filter that’s used to match traffic
and place the packet into a specific routing instance. The method
demonstrated during the test was to simply match on the packet’s
source address and network mask. This approach is very coarse and
makes the assumption that the traffic conforms to variable length
subnet masking (VLSM) and the networks are sequential.
An alternative method that's better equipped to deal with any range of
source addresses is to match on the actual bits of the source IP
address. The only caveat is that this method requires the number of
next hops to be a power of two.
If there were four next hops, a good method would be to match on the
last two bits of the source IP address. The four values would be 00, 01,
10, and 11. Junos firewall filters support noncontiguous address
matching. Let’s take a look at how such a firewall filter would be
created:
firewall {
family inet {
filter count-dist {
term 1 {
from {
source-address {
0.0.0.0/0.0.0.3;
}
}
then count c1;
}
term 2 {
from {
source-address {
0.0.0.1/0.0.0.3;
}
}
then count c2;
}
term 3 {
from {
source-address {
0.0.0.2/0.0.0.3;
}
}
then count c3;
}
term 4 {
from {
source-address {
0.0.0.3/0.0.0.3;
}
}
then count c4;
}
term else {
then {
discard;
log;
count c-else;
}
}
}
}
}
Address   Prefix    Last Two Bits
0.0.0.0   0.0.0.3   00
0.0.0.1   0.0.0.3   01
0.0.0.2   0.0.0.3   10
0.0.0.3   0.0.0.3   11
The huge benefit of this method is that it matches any source address,
because the first 30 bits are irrelevant. For example, 0.0.0.0/0.0.0.3
would match both 10.100.4.4 and 192.168.6.188, because the last two
bits of each address are binary 00. Let's review the bit positions of an
IP packet:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 5 |Type of Service| Total Length = 21 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=0| Fragment Offset = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 123 | Protocol = 1 | header checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+
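The AND-and-compare behavior of a noncontiguous match is easy to verify outside of Junos. This short Python sketch (illustrative only, not part of the lab) mimics how a term such as 0.0.0.0/0.0.0.3 evaluates an address:

```python
import ipaddress

def matches(addr: str, value: str, mask: str) -> bool:
    """Return True if addr matches the noncontiguous value/mask pair,
    i.e. the bits of addr selected by mask equal the bits of value."""
    a = int(ipaddress.IPv4Address(addr))
    v = int(ipaddress.IPv4Address(value))
    m = int(ipaddress.IPv4Address(mask))
    return (a & m) == (v & m)

# Both addresses end in binary 00, so both fall into term 1 (0.0.0.0/0.0.0.3).
print(matches("10.100.4.4", "0.0.0.0", "0.0.0.3"))      # True
print(matches("192.168.6.188", "0.0.0.0", "0.0.0.3"))   # True
# 10.100.4.5 ends in binary 01, so it matches term 2 (0.0.0.1/0.0.0.3) instead.
print(matches("10.100.4.5", "0.0.0.1", "0.0.0.3"))      # True
print(matches("10.100.4.5", "0.0.0.0", "0.0.0.3"))      # False
```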
Bits Matched   Next Hops
2              4
3              8
4              16
5              32
If there were eight next hops, how would the firewall filter be modified?
The last three bits of the IP address must be evaluated. Table 5.3 shows
the address and prefix required to match the last three bits of an
address.
Table 5.3 Matrix of Addresses, Prefixes, and the Last Three Bits Matched
Address   Prefix    Last Three Bits
0.0.0.0   0.0.0.7   000
0.0.0.1   0.0.0.7   001
0.0.0.2   0.0.0.7   010
0.0.0.3   0.0.0.7   011
0.0.0.4   0.0.0.7   100
0.0.0.5   0.0.0.7   101
0.0.0.6   0.0.0.7   110
0.0.0.7   0.0.0.7   111
NOTE Table 5.4 is a corner-case example showing the flexibility of Junos
firewall filters in accepting a mask such as 0.0.3.0. In nearly all cases
it's better to use the mask of 0.0.0.3.
Table 5.4 Matrix of Addresses, Prefixes, and the Last Two Bits of the Third Octet

Address   Prefix    Bits 22 and 23
0.0.0.0   0.0.3.0   00
0.0.1.0   0.0.3.0   01
0.0.2.0   0.0.3.0   10
0.0.3.0   0.0.3.0   11
Assuming there's a use case with more diversity in the third octet of the
source IP address, Table 5.4 serves as the reference for matching the
last two bits within the scope of the third octet, which are bits 22 and
23. In this example, 0.0.0.0/0.0.3.0 would match both 172.16.56.17
and 10.244.244.17, as bits 22 and 23 of each address are binary 00.
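The same check can be applied to the third-octet variant. This Python sketch (again illustrative, not part of the lab) extracts bits 22 and 23, the low-order two bits of the third octet:

```python
import ipaddress

def bits_22_23(addr: str) -> int:
    """Extract bits 22 and 23 of an IPv4 address, i.e. the
    low-order two bits of the third octet."""
    return (int(ipaddress.IPv4Address(addr)) >> 8) & 0b11

# Third octets 56 and 244 both end in binary 00, matching 0.0.0.0/0.0.3.0.
print(bits_22_23("172.16.56.17"))   # 0
print(bits_22_23("10.244.244.17"))  # 0
# A third octet of 1 ends in binary 01, matching 0.0.1.0/0.0.3.0 instead.
print(bits_22_23("10.0.1.0"))       # 1
```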
Firewall Clustering
Another decision point in the performance versus stability question is
the use of Juniper SRX clustering. When two firewalls are configured
to act as a single cluster, the benefit is that if one firewall fails, the
other firewall takes over without dropping any current sessions. The
drawback is that an architecture built on firewall clusters requires
twice the capital investment and decreases performance by roughly ten
percent. For example, a standalone SRX5800 with ten SPCs is able to
provide roughly 47Gbps of firewall throughput with IMIX traffic.
Firewall clustering would increase the capital investment to two
SRX5800 chassis and 20 SPCs and decrease performance to roughly
42Gbps of firewall throughput with IMIX traffic. However, the benefit
is that the cluster is able to recover from a failure without dropping
sessions.
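Those figures make the performance cost easy to quantify; a quick back-of-the-envelope check in Python, using the throughput numbers quoted above:

```python
standalone_gbps = 47.0  # one SRX5800, ten SPCs, IMIX traffic
clustered_gbps = 42.0   # two clustered SRX5800s, 20 SPCs, IMIX traffic

# Relative throughput loss introduced by clustering
loss_pct = (standalone_gbps - clustered_gbps) / standalone_gbps * 100
print(round(loss_pct, 1))  # 10.6, i.e. roughly ten percent
```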
A good method to provide horizontal scaling with increased stability is
to use FBF combined with firewall clustering. This architecture increases
the number of failure domains in the FIB, so that a failure is limited to
the firewall from which it came. Firewall clustering also decreases the
likelihood of a FIB change, as the number of next hops in the FIB
wouldn't change if a firewall node were to fail.
When considering the use of firewall clusters, the technical impact is
very small in terms of performance. The real question goes back to the
business. Is the extra investment of capital worth the extra stability? In
many cases the answer will be yes, but in other scenarios where
performance is most important or only a certain level of stability is
required, firewall clustering wouldn’t be necessary.
Summary
Scaling traffic beyond a single Juniper SRX requires careful thought
and consideration. The traffic characteristics need to be analyzed, as
they are the first contributing factor to firewall performance. Is the
traffic:
High or low CPS?
High or low throughput?
Short-lived or long-lived sessions?
Does the traffic consume a lot of sessions?
How many packets per second?
The other limiting factor is what type of firewall services will be
applied to the traffic:
Firewall cluster?
Intrusion Detection and Prevention (IDP)?
IPSec?
AppSecure?
Once the traffic characteristics have been identified, the next step is to
understand the business requirements of the traffic. Which traffic
requires performance and which traffic requires stability? Once these
two questions have been answered, the technical implementation is
trivial.
Once the baseline configuration and topology have been created, the
firewalls need to be monitored via SNMP to gather data such as
throughput, SPU CPU usage, and other metrics. Collecting this data
and storing it over time is critical. Being able to view the performance
characteristics of the firewalls over time gives the operational staff a
way to plan and predict future upgrades. The general rule of thumb is
that when the firewalls exceed 50% SPU usage, it's time to add
firewalls to the pool for more capacity. It's also possible that a
particular firewall in the pool could be handling more traffic than its
peers, which requires that the firewall filter be adjusted to shift traffic
to a firewall with spare capacity.
The following Appendix contains all the device configurations used in
this proof of concept.
Appendix
Device Configurations
MX240. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
EX4500-VC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
SRX-1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
SRX-2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
SRX-3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
SRX-4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
EX8200-ECMP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
EX8200-FBF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
MX240
chassis {
aggregated-devices {
ethernet {
device-count 1;
}
}
}
interfaces {
xe-0/0/0 {
gigether-options {
802.3ad ae0;
}
}
xe-0/1/0 {
gigether-options {
802.3ad ae0;
}
}
xe-0/2/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
xe-0/3/0 {
encapsulation ethernet-bridge;
unit 0 {
family bridge;
}
}
ae0 {
aggregated-ether-options {
minimum-links 1;
link-speed 10g;
lacp {
active;
}
}
unit 0 {
family inet {
address 10.3.0.1/24;
}
family iso;
}
}
irb {
unit 100 {
family inet {
address 10.7.7.1/24;
}
}
}
lo0 {
unit 0 {
family inet {
address 10.3.255.10/32;
}
family iso {
address 49.0000.0010.0003.0255.0010.00;
}
}
}
}
routing-options {
static {
route 0.0.0.0/0 discard;
route 20.20.31.0/24 next-hop 10.3.0.11;
route 20.20.32.0/24 next-hop 10.3.0.12;
route 20.20.33.0/24 next-hop 10.3.0.13;
route 20.20.34.0/24 next-hop 10.3.0.14;
}
autonomous-system 4567;
}
protocols {
bgp {
group SRX {
type internal;
local-address 10.3.255.10;
export export-default;
multipath;
neighbor 10.3.255.11;
neighbor 10.3.255.12;
neighbor 10.3.255.13;
neighbor 10.3.255.14;
}
group IXIA {
type external;
peer-as 1234;
neighbor 10.7.7.2;
}
}
isis {
level 1 disable;
interface ae0.0 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
level 2 priority 127;
level 1 disable;
}
interface lo0.0 {
passive;
}
}
}
policy-options {
policy-statement export-default {
term 1 {
from {
protocol static;
route-filter 0.0.0.0/0 exact;
}
then {
next-hop self;
accept;
}
}
term 2 {
then reject;
}
}
}
bridge-domains {
IXIA {
vlan-id 100;
interface xe-0/2/0.0;
interface xe-0/3/0.0;
routing-interface irb.100;
}
}
EX4500-VC
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-0/0/0 {
ether-options {
802.3ad ae0;
}
}
xe-0/0/1 {
ether-options {
802.3ad ae1;
}
}
xe-0/0/2 {
ether-options {
802.3ad ae2;
}
}
xe-0/0/3 {
ether-options {
802.3ad ae3;
}
}
xe-0/0/4 {
ether-options {
802.3ad ae4;
}
}
xe-0/0/5 {
ether-options {
802.3ad ae5;
}
}
xe-1/0/0 {
ether-options {
802.3ad ae0;
}
}
xe-1/0/1 {
ether-options {
802.3ad ae1;
}
}
xe-1/0/2 {
ether-options {
802.3ad ae2;
}
}
xe-1/0/3 {
ether-options {
802.3ad ae3;
}
}
xe-1/0/4 {
ether-options {
802.3ad ae4;
}
}
xe-1/0/5 {
ether-options {
802.3ad ae5;
}
}
ae0 {
description SRX-1;
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members [ TRUST UNTRUST ];
}
}
}
}
ae1 {
description SRX-2;
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members [ TRUST UNTRUST ];
}
}
}
}
ae2 {
description SRX-3;
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members [ TRUST UNTRUST ];
}
}
}
}
ae3 {
description SRX-4;
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members [ TRUST UNTRUST ];
}
}
}
}
ae4 {
description MX240;
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode access;
vlan {
members UNTRUST;
}
}
}
}
ae5 {
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members [ TRUST UNTRUST ];
}
}
}
}
vlan {
unit 200 {
family inet {
address 10.2.0.254/24;
}
}
unit 300 {
family inet {
address 10.3.0.254/24;
}
}
}
}
vlans {
TRUST {
vlan-id 200;
l3-interface vlan.200;
}
UNTRUST {
vlan-id 300;
l3-interface vlan.300;
}
}
virtual-chassis {
preprovisioned;
no-split-detection;
member 0 {
role routing-engine;
serial-number XX;
}
member 1 {
role routing-engine;
serial-number XX;
}
}
SRX-1
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-10/2/0 {
gigether-options {
802.3ad ae0;
}
}
xe-10/3/0 {
gigether-options {
802.3ad ae0;
}
}
ae0 {
disable;
vlan-tagging;
aggregated-ether-options {
lacp {
periodic fast;
}
}
unit 200 {
vlan-id 200;
family inet {
address 10.2.0.11/24;
}
family iso;
}
unit 300 {
vlan-id 300;
family inet {
address 10.3.0.11/24;
}
family iso;
}
}
lo0 {
unit 0 {
family inet {
address 10.3.255.11/32;
}
family iso {
address 49.0000.1111.1111.1111.00;
}
}
}
}
routing-options {
autonomous-system 4567;
}
protocols {
bgp {
group UNTRUST {
type internal;
local-address 10.3.255.11;
neighbor 10.3.255.10;
}
group TRUST {
type external;
multihop;
local-address 10.3.255.11;
peer-as 1234;
neighbor 10.2.255.10;
}
}
isis {
apply-groups isis-bfd;
level 1 disable;
interface ae0.200;
interface ae0.300;
interface lo0.0 {
passive;
}
}
}
security {
address-book {
global {
address 0/0 0.0.0.0/0;
address 192.168/16 192.168.0.0/16;
address 20/8 20.0.0.0/8;
address 1/24 1.0.0.0/24;
address 20.20/16 20.20.0.0/16;
}
}
nat {
source {
pool test-snat {
address {
20.20.31.0/24;
}
}
rule-set rs1 {
from zone TRUST;
to zone UNTRUST;
rule r1 {
match {
source-address 192.168.0.0/16;
}
then {
source-nat {
pool {
test-snat;
}
}
}
}
}
}
}
policies {
from-zone TRUST to-zone UNTRUST {
policy permit-all {
match {
source-address any;
destination-address any;
application any;
}
then {
permit;
count;
}
}
}
}
zones {
security-zone TRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.200;
}
}
security-zone UNTRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.300;
}
}
}
}
SRX-2
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-10/2/0 {
gigether-options {
802.3ad ae0;
}
}
xe-10/3/0 {
gigether-options {
802.3ad ae0;
}
}
ae0 {
vlan-tagging;
aggregated-ether-options {
lacp {
periodic fast;
}
}
unit 200 {
vlan-id 200;
family inet {
address 10.2.0.12/24;
}
family iso;
}
unit 300 {
vlan-id 300;
family inet {
address 10.3.0.12/24;
}
family iso;
}
}
lo0 {
unit 0 {
family inet {
address 10.3.255.12/32;
}
family iso {
address 49.0000.2222.2222.2222.00;
}
}
}
}
routing-options {
autonomous-system 4567;
}
protocols {
bgp {
group UNTRUST {
type internal;
local-address 10.3.255.12;
neighbor 10.3.255.10;
}
group TRUST {
type external;
multihop;
local-address 10.3.255.12;
peer-as 1234;
neighbor 10.2.255.10;
}
}
isis {
apply-groups isis-bfd;
level 1 disable;
interface ae0.200;
interface ae0.300;
interface lo0.0 {
passive;
}
}
}
security {
address-book {
global {
address 0/0 0.0.0.0/0;
address 192.168/16 192.168.0.0/16;
address 20/8 20.0.0.0/8;
address 1/24 1.0.0.0/24;
address 20.20/16 20.20.0.0/16;
}
}
nat {
source {
pool test-snat {
address {
20.20.32.0/24;
}
}
rule-set rs1 {
from zone TRUST;
to zone UNTRUST;
rule r1 {
match {
source-address 192.168.0.0/16;
}
then {
source-nat {
pool {
test-snat;
}
}
}
}
}
}
}
policies {
from-zone TRUST to-zone UNTRUST {
policy permit-all {
match {
source-address any;
destination-address any;
application any;
}
then {
permit;
count;
}
}
}
}
zones {
security-zone TRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.200;
}
}
security-zone UNTRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.300;
}
}
}
}
SRX-3
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-10/2/0 {
gigether-options {
802.3ad ae0;
}
}
xe-10/3/0 {
gigether-options {
802.3ad ae0;
}
}
ae0 {
vlan-tagging;
aggregated-ether-options {
lacp {
periodic fast;
}
}
unit 200 {
vlan-id 200;
family inet {
address 10.2.0.13/24;
}
family iso;
}
unit 300 {
vlan-id 300;
family inet {
address 10.3.0.13/24;
}
family iso;
}
}
lo0 {
unit 0 {
family inet {
address 10.3.255.13/32;
}
family iso {
address 49.0000.3333.3333.3333.00;
}
}
}
}
routing-options {
autonomous-system 4567;
}
protocols {
bgp {
group UNTRUST {
type internal;
local-address 10.3.255.13;
neighbor 10.3.255.10;
}
group TRUST {
type external;
multihop;
local-address 10.3.255.13;
peer-as 1234;
neighbor 10.2.255.10;
}
}
isis {
apply-groups isis-bfd;
level 1 disable;
interface ae0.200;
interface ae0.300;
interface lo0.0 {
passive;
}
}
}
security {
address-book {
global {
address 0/0 0.0.0.0/0;
address 192.168/16 192.168.0.0/16;
address 20/8 20.0.0.0/8;
address 1/24 1.0.0.0/24;
address 20.20/16 20.20.0.0/16;
}
}
nat {
source {
pool test-snat {
address {
20.20.33.0/24;
}
}
rule-set rs1 {
from zone TRUST;
to zone UNTRUST;
rule r1 {
match {
source-address 192.168.0.0/16;
}
then {
source-nat {
pool {
test-snat;
}
}
}
}
}
}
}
policies {
from-zone TRUST to-zone UNTRUST {
policy permit-all {
match {
source-address any;
destination-address any;
application any;
}
then {
permit;
count;
}
}
}
}
zones {
security-zone TRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.200;
}
}
security-zone UNTRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.300;
}
}
}
}
SRX-4
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-10/2/0 {
gigether-options {
802.3ad ae0;
}
}
xe-10/3/0 {
gigether-options {
802.3ad ae0;
}
}
ae0 {
vlan-tagging;
aggregated-ether-options {
lacp {
periodic fast;
}
}
unit 200 {
vlan-id 200;
family inet {
address 10.2.0.14/24;
}
family iso;
}
unit 300 {
vlan-id 300;
family inet {
address 10.3.0.14/24;
}
family iso;
}
}
lo0 {
unit 0 {
family inet {
address 10.3.255.14/32;
}
family iso {
address 49.0000.4444.4444.4444.00;
}
}
}
}
routing-options {
autonomous-system 4567;
}
protocols {
bgp {
group UNTRUST {
type internal;
local-address 10.3.255.14;
neighbor 10.3.255.10;
}
group TRUST {
type external;
multihop;
local-address 10.3.255.14;
peer-as 1234;
neighbor 10.2.255.10;
}
}
isis {
apply-groups isis-bfd;
level 1 disable;
interface ae0.200;
interface ae0.300;
interface lo0.0 {
passive;
}
}
}
security {
address-book {
global {
address 0/0 0.0.0.0/0;
address 192.168/16 192.168.0.0/16;
address 20/8 20.0.0.0/8;
address 1/24 1.0.0.0/24;
address 20.20/16 20.20.0.0/16;
}
}
nat {
source {
pool test-snat {
address {
20.20.34.0/24;
}
}
rule-set rs1 {
from zone TRUST;
to zone UNTRUST;
rule r1 {
match {
source-address 192.168.0.0/16;
}
then {
source-nat {
pool {
test-snat;
}
}
}
}
}
}
}
policies {
from-zone TRUST to-zone UNTRUST {
policy permit-all {
match {
source-address any;
destination-address any;
application any;
}
then {
permit;
count;
}
}
}
}
zones {
security-zone TRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.200;
}
}
security-zone UNTRUST {
tcp-rst;
host-inbound-traffic {
system-services {
ping;
}
protocols {
bgp;
bfd;
}
}
interfaces {
ae0.300;
}
}
}
}
EX8200-ECMP
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-0/0/0 {
ether-options {
802.3ad ae0;
}
}
xe-0/0/1 {
ether-options {
802.3ad ae0;
}
}
xe-0/0/2 {
unit 0 {
family ethernet-switching {
port-mode access;
vlan {
members DC;
}
}
}
}
xe-0/0/3 {
unit 0 {
family ethernet-switching {
port-mode access;
vlan {
members DC;
}
}
}
}
ae0 {
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members TRUST;
}
}
}
}
lo0 {
unit 0 {
family inet {
address 10.2.255.10/32;
}
family iso {
address 49.0000.7777.7777.7777.00;
}
}
}
vlan {
unit 200 {
family inet {
address 10.2.0.253/24;
}
family iso;
}
unit 1000 {
family inet {
address 192.168.1.1/24;
}
family iso;
}
}
}
routing-options {
autonomous-system 1234;
forwarding-table {
export lb;
}
}
protocols {
bgp {
group SRX {
type external;
multihop;
local-address 10.2.255.10;
peer-as 4567;
multipath;
neighbor 10.3.255.11;
neighbor 10.3.255.12;
neighbor 10.3.255.13;
neighbor 10.3.255.14;
}
}
isis {
level 1 disable;
interface lo0.0 {
passive;
}
interface vlan.200 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
interface vlan.1000;
}
}
policy-options {
policy-statement lb {
then {
load-balance per-packet;
}
}
}
vlans {
DC {
vlan-id 1000;
l3-interface vlan.1000;
}
TRUST {
vlan-id 200;
l3-interface vlan.200;
}
}
EX8200-FBF
chassis {
aggregated-devices {
ethernet {
device-count 10;
}
}
}
interfaces {
xe-0/0/0 {
ether-options {
802.3ad ae0;
}
}
xe-0/0/1 {
ether-options {
802.3ad ae0;
}
}
xe-0/0/2 {
unit 0 {
family ethernet-switching {
port-mode access;
vlan {
members DC;
}
}
}
}
xe-0/0/3 {
unit 0 {
family ethernet-switching {
port-mode access;
vlan {
members DC;
}
}
}
}
ae0 {
aggregated-ether-options {
lacp {
active;
periodic fast;
}
}
unit 0 {
family ethernet-switching {
port-mode trunk;
vlan {
members TRUST;
}
}
}
}
lo0 {
unit 0 {
family inet {
address 10.2.255.10/32;
}
family iso {
address 49.0000.7777.7777.7777.00;
}
}
}
vlan {
unit 200 {
family inet {
address 10.2.0.253/24;
}
family iso;
}
unit 1000 {
family inet {
filter {
input distribute-default;
}
address 192.168.1.1/24;
}
family iso;
}
}
}
routing-options {
interface-routes {
rib-group inet SRX;
}
rib-groups {
SRX {
import-rib [ inet.0 SRX-1.inet.0 SRX-2.inet.0 SRX-3.inet.0 SRX-4.inet.0 ];
}
}
autonomous-system 1234;
forwarding-table {
export lb;
}
}
protocols {
bgp {
group SRX {
type external;
multihop;
local-address 10.2.255.10;
peer-as 4567;
multipath;
neighbor 10.3.255.11;
neighbor 10.3.255.12;
neighbor 10.3.255.13;
neighbor 10.3.255.14;
}
}
isis {
level 1 disable;
interface lo0.0 {
passive;
}
interface vlan.200 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
interface vlan.1000;
}
}
policy-options {
policy-statement lb {
term SRX-1 {
from instance SRX-1;
then accept;
}
term SRX-2 {
from instance SRX-2;
then accept;
}
term SRX-3 {
from instance SRX-3;
then accept;
}
term SRX-4 {
from instance SRX-4;
then accept;
}
term master {
from instance master;
then {
load-balance per-packet;
}
}
}
}
firewall {
family inet {
filter distribute-default {
term SRX-1 {
from {
source-address {
192.168.1.0/26;
}
}
then {
count SRX-1;
routing-instance SRX-1;
}
}
term SRX-2 {
from {
source-address {
192.168.1.64/26;
}
}
then {
count SRX-2;
routing-instance SRX-2;
}
}
term SRX-3 {
from {
source-address {
192.168.1.128/26;
}
}
then {
count SRX-3;
routing-instance SRX-3;
}
}
term SRX-4 {
from {
source-address {
192.168.1.192/26;
}
}
then {
count SRX-4;
routing-instance SRX-4;
}
}
term else {
then {
count SRX-discard;
discard;
}
}
}
}
}
routing-instances {
SRX-1 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.11 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
qualified-next-hop 10.2.0.12 {
metric 6;
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
}
}
}
}
SRX-2 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.12 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
qualified-next-hop 10.2.0.11 {
metric 6;
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
}
}
}
}
SRX-3 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.13 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
qualified-next-hop 10.2.0.14 {
metric 6;
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
}
}
}
}
SRX-4 {
instance-type virtual-router;
routing-options {
static {
route 0.0.0.0/0 {
qualified-next-hop 10.2.0.14 {
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
qualified-next-hop 10.2.0.13 {
metric 6;
bfd-liveness-detection {
minimum-interval 300;
multiplier 3;
}
}
}
}
}
}
}
vlans {
DC {
vlan-id 1000;
l3-interface vlan.1000;
}
TRUST {
vlan-id 200;
l3-interface vlan.200;
}
}
http://www.juniper.net/books
The books in the Juniper Networks Technical Library may assist you in
understanding and implementing network efficiency.
http://www.juniper.net/dayone
The Day One book series is available for free download in PDF
format. Select titles also feature a Copy and Paste edition for direct
placement of Junos configurations.
http://forums.juniper.net/jnet
The Juniper-sponsored J-Net Communities forum is dedicated to
sharing information, best practices, and questions about Juniper
products, technologies, and solutions. Register to participate in this
free forum.
http://www.juniper.net/techpubs/
Juniper Networks technical documentation includes everything you
need to understand and configure all aspects of Junos, including
MPLS. The documentation set is both comprehensive and thoroughly
reviewed by Juniper engineering.
http://www.juniper.net/training/fasttrack
Take courses online, on location, or at one of the partner training
centers around the world. The Juniper Networks Technical Certification
Program (JNTCP) allows you to earn certifications by demonstrating
competence in the configuration and troubleshooting of Juniper
products. If you want the fast track to earning your certifications in
enterprise routing, switching, or security, use the available online
courses, student guides, and lab guides.