5 IP SLAs
Rick Graziani Cabrillo College graziani@cabrillo.edu Spring 2011
Redundancy
Geographic diversity and path diversity are often included.
Dual devices and links are common. Dual WAN providers are common. Dual data centers are sometimes used, especially for large companies and large e-commerce sites. Dual collocation facilities, dual phone central office facilities, and dual power substations can be implemented.
Technology
Cisco Nonstop Forwarding (NSF). Stateful Switchover (SSO). Graceful Restart. Cisco IOS IP Service Level Agreements (SLA). Object Tracking. Firewall Stateful Failover.
People
Prepare, Plan, Design, Implement, Operate, and Optimize (PPDIOO) is a guide. Work habits and attention to detail are important. Skills are acquired via ongoing technical training. Good communication and documentation are critical. Use lab testing to simulate failover scenarios. Take time to design. Identify roles. Identify responsibilities. Align teams with services. Ensure there is time to do the job.
Processes
Organizations should build repeatable processes. Organizations should use labs appropriately. Organizations need meaningful change controls. Management of operational changes is important.
Tools
Network diagrams. Documentation of network design evolution. Key addresses, VLANs, and servers documented. Documentation tying services to applications and physical servers.
Network-Level Resiliency
Built with device and link redundancy. Employs fast convergence. Relies on monitoring with NTP, SNMP, Syslog, and IP SLA.
Tuned routing protocols fail over in less than 1 second. RSTP converges in about 1 second. EtherChannel can fail over in approximately 1 second. Default HSRP timers are 3 seconds for hello and 10 seconds for hold time. Stateful service modules typically fail over within 3 to 5 seconds. TCP/IP stacks have up to a 9-second tolerance.
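As a rough check of these figures against the 9-second TCP/IP tolerance, the sketch below compares each quoted worst-case failover time with that budget. The dictionary keys and the function name are illustrative, and the values are simply the numbers quoted on this slide, not measurements.

```python
# Worst-case failover times in seconds, taken from the figures quoted above.
# These are slide values used for illustration, not measured results.
FAILOVER_SECONDS = {
    "tuned routing protocol": 1.0,
    "RSTP": 1.0,
    "EtherChannel": 1.0,
    "HSRP (default hold time)": 10.0,
    "stateful service module": 5.0,
}

TCP_TOLERANCE_SECONDS = 9.0  # upper bound quoted for TCP/IP stacks


def exceeds_tolerance(times, tolerance=TCP_TOLERANCE_SECONDS):
    """Return the mechanisms whose failover time is longer than the tolerance."""
    return sorted(name for name, secs in times.items() if secs > tolerance)


if __name__ == "__main__":
    print(exceeds_tolerance(FAILOVER_SECONDS))
```

Under these figures, only the default HSRP hold time falls outside the budget.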
Optimal Redundancy
Provide alternate paths. Avoid too much redundancy. Avoid single point of failure. Use Cisco NSF with SSO, if applicable. Use Cisco NSF with routing protocols.
Key element of high availability. Easy to implement at core and distribution. The access layer switch is a single point of failure. Reduce outages to 1 to 3 seconds in the access layer with SSO in an L2 environment, or Cisco NSF with SSO in an L3 environment.
Logging Services
Events on networking devices can be logged, covering various events and various levels of severity.
Events are logged to:
Console (default): console display
Buffer
Server
Examples:
Interfaces up or down
Configuration changes
Routing protocol adjacencies
Logging severity levels on Cisco Systems devices are as follows:
(0) Emergencies
(1) Alerts
(2) Critical
(3) Errors
(4) Warnings
(5) Notifications
(6) Informational
(7) Debugging
By default, all messages from level 0 to 7 are logged to the console.
Console
You can also adjust the logging severity level of the console. By default, all messages from level 0 to 7 are logged to the console. You can configure the severity level as an optional parameter:
logging console level
This limits the messages displayed on the console terminal to the specified level and (numerically) lower levels. You can enter the level number or level name.
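The "specified level and numerically lower levels" rule can be sketched in a few lines of Python. This is only a model of the comparison, not IOS code; the function names `level_number` and `is_logged` are hypothetical, and the level names are the keywords accepted by the logging commands.

```python
# Level names in severity order: index 0 is the most severe.
SEVERITY_NAMES = [
    "emergencies", "alerts", "critical", "errors",
    "warnings", "notifications", "informational", "debugging",
]


def level_number(level):
    """Accept a level number (0-7) or a level name and return the number."""
    if isinstance(level, int):
        if 0 <= level <= 7:
            return level
        raise ValueError("level must be between 0 and 7")
    # list.index raises ValueError for an unknown name.
    return SEVERITY_NAMES.index(level.lower())


def is_logged(message_severity, configured_level):
    """True when the message severity is numerically <= the configured level."""
    return message_severity <= level_number(configured_level)


if __name__ == "__main__":
    # With the level set to warnings (4), errors (3) pass, notifications (5) do not.
    print(is_logged(3, "warnings"), is_logged(5, "warnings"))
```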
Buffer
logging buffered [buffer-size | level]
Buffer logging may or may not be enabled by default. By default, messages of all severity levels are logged to the buffer.
show logging
Displays the content of the buffer. The buffer is circular: when it reaches its maximum capacity, the oldest messages are discarded to make room for new messages.
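The circular behaviour of the buffer can be illustrated with a bounded queue. The sketch below is a simplification: the class name `LogBuffer` is hypothetical, and it counts messages, whereas the real buffer size is configured in bytes.

```python
from collections import deque


class LogBuffer:
    """Toy circular log buffer that keeps only the newest messages."""

    def __init__(self, max_messages):
        # A deque with maxlen silently drops the oldest entry when full.
        self._messages = deque(maxlen=max_messages)

    def log(self, message):
        self._messages.append(message)

    def show(self):
        """Return the buffered messages, oldest first."""
        return list(self._messages)


if __name__ == "__main__":
    buf = LogBuffer(max_messages=3)
    for i in range(5):
        buf.log(f"msg {i}")
    print(buf.show())  # the two oldest messages have been discarded
```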
Configuring Syslog
To configure logging to the buffer of the local switch, use the command logging buffered.
Switch(config)# logging buffered ?
  <0-7>              Logging severity level
  <4096-2147483647>  Logging buffer size
  alerts             Immediate action needed (severity=1)
  critical           Critical conditions (severity=2)
  debugging          Debugging messages (severity=7)
  discriminator      Establish MD-Buffer association
  emergencies        System is unusable (severity=0)
  errors             Error conditions (severity=3)
  informational      Informational messages (severity=6)
  notifications      Normal but significant conditions (severity=5)
  warnings           Warning conditions (severity=4)
  xml                Enable logging in XML to XML logging buffer
Server
logging ip-address (in some IOS versions, the command is logging host)
By default, only messages of severity level 6 (informational) and numerically lower levels are logged to the syslog server. This can be changed by entering the logging trap level command.
To configure a syslog server, use the logging ip_addr global configuration command. To control which severity levels of messages are sent to the syslog server, use the global configuration command logging trap level.
Switch(config)# logging trap ?
  <0-7>          Logging severity level
  alerts         Immediate action needed
  critical       Critical conditions
  debugging      Debugging messages
  emergencies    System is unusable
  errors         Error conditions
  informational  Informational messages
  notifications  Normal but significant conditions
  warnings       Warning conditions
Level 6: Informational
Level 7: Debugging
Syslog Facilities
Syslog facilities are service identifiers that identify and categorize system state data for error and event message reporting. Cisco IOS has more than 500 facilities. The most common syslog facilities are: IP, OSPF, SYS (operating system), IP Security (IPsec), Route Switch Processor (RSP), and Interface (IF).
System messages begin with a percent sign (%) and contain the following fields:
Facility: A code consisting of two or more uppercase letters that indicates the hardware device, protocol, or a module of the system software.
Severity: A single-digit code from 0 to 7 that reflects the severity of the condition. The lower the number, the more serious the situation.
Mnemonic: A code that uniquely identifies the error message.
Message-text: A text string describing the condition. This portion of the message sometimes contains detailed information about the event, including terminal port numbers, network addresses, or addresses that correspond to locations in the system memory address space.
Switch# show logging | include LINK-3
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/2, changed state to up
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
Switch# show logging | begin %DUAL
2d22h: %DUAL-5-NBRCHANGE: EIGRP-IPv4:(10) 10: Neighbor 10.1.253.13 (FastEthernet0/11) is down: interface down
2d22h: %LINK-3-UPDOWN: Interface FastEthernet0/11, changed state to down
2d22h: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/11, changed state to down
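The %FACILITY-SEVERITY-MNEMONIC: message-text layout lends itself to simple parsing, for example when filtering collected logs off the device. The following is a minimal sketch; the regular expression and the function name `parse_message` are assumptions made for illustration and may not cover every message variant that IOS can emit.

```python
import re

# %FACILITY-SEVERITY-MNEMONIC: message-text
MESSAGE_RE = re.compile(
    r"%(?P<facility>[A-Z][A-Z0-9_]+)"   # two or more characters, uppercase
    r"-(?P<severity>[0-7])"             # single digit, 0 (worst) to 7
    r"-(?P<mnemonic>[A-Z0-9_]+)"        # identifies the message
    r": (?P<text>.*)"                   # free-form description
)


def parse_message(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields


if __name__ == "__main__":
    sample = "2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up"
    print(parse_message(sample))
```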
Cisco IP SLA
IP SLA, a feature of Cisco IOS software, allows you to configure a router to send synthetic traffic to a host computer or to a router that has been configured to respond (a Responder).
IP SLA is very useful for performance measurement, monitoring, and network baselining. You can tie the results of IP SLA operations to other features of your router and trigger actions based on the results of the probe.
To implement IP SLA network performance measurement, you need to perform the following tasks:
1. Enable the IP SLA responder, if required.
2. Configure the required IP SLA operation type.
3. Configure any options available for the specified operation type.
4. Configure threshold conditions, if required.
5. Schedule the operation to run, and then let the operation run for a period of time to gather statistics.
6. Display and interpret the results of the operation using the Cisco IOS CLI or a network management system (NMS), with Simple Network Management Protocol (SNMP).
Depending on the type of probe you set up, you may or may not need to configure an IP SLA Responder. For example, if you are setting up a simple echo probe to an IP host, you do not need a responder. An IP SLA Responder allows for more detailed information to be retrieved.
The IP SLA Responder is a component embedded in the destination Cisco routing device. It allows the system to anticipate and respond to IP SLA request packets. It provides accurate measurements without the need for dedicated probes, as well as additional statistics not available from standard ICMP-based measurements. See the information regarding IP SLAs with Responder Time Stamps.
The IP SLA source (a Cisco device) uses the IP SLA Control Protocol to communicate with the IP SLA Responder. The source tells the responder which port to listen and respond on. The responder then enables the specified UDP or TCP port for a specific duration.
Customer A is multihoming to two ISPs. Customer A is not using BGP with the ISPs but is using static default routes instead.
Two static default routes with different administrative distances are configured:
Link to ISP-1 is the primary link.
Link to ISP-2 is the backup link.
The static default route with the lower administrative distance is preferred and installed in the routing table. However, if there is a problem within the ISP-1 domain while its interface to Customer A is still up, all traffic from Customer A will still go to that ISP, and the traffic may then be lost within the ISP.
The solution to this issue is the Cisco IOS IP SLAs functionality. Configure the SLAs to:
Continuously check the reachability of a specific destination, such as a provider edge (PE) router interface, the ISP's DNS server, or any other specific destination (here, 10.1.1.1 and 172.16.1.1).
Conditionally announce the default route only if the connectivity is verified.
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
R1(config)# track 1 rtr 11 reachability
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
Defining the Probe
ip sla monitor 11: defines probe 11.
type echo protocol ipIcmpEcho: specifies that ICMP echoes are sent to destination 10.1.1.1 to check connectivity, with FastEthernet0/0 as the source interface.
frequency 10: repeats the connectivity test every 10 seconds.
ip sla monitor schedule 11 life forever start-time now: starts the probe immediately and lets it run indefinitely.
Defining the Tracking Object
track 1 rtr 11 reachability: defines tracking object 1 and links it to probe 11 (defined in the first step), so that the reachability of 10.1.1.1 is tracked. Object 1 is used in the next step.
Defining an action based on the status of the tracking object
ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1: conditionally installs the default route out fa0/0, with an administrative distance of 2, only while tracking object 1 is true, that is, while the probe succeeds.
To summarize: if 10.1.1.1 is reachable, a static default route out fa0/0 with an administrative distance of 2 is installed in the routing table.
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
R1(config)# track 2 rtr 22 reachability
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
Defining the Probe
ip sla monitor 22: defines probe 22.
type echo protocol ipIcmpEcho: specifies that ICMP echoes are sent to destination 172.16.1.1 to check connectivity, with FastEthernet0/1 as the source interface.
frequency 10: repeats the connectivity test every 10 seconds.
ip sla monitor schedule 22 life forever start-time now: starts the probe immediately and lets it run indefinitely.
Defining the Tracking Object
track 2 rtr 22 reachability: defines tracking object 2 and links it to probe 22 (defined in the first step), so that the reachability of 172.16.1.1 is tracked. Object 2 is used in the next step.
Defining an action based on the status of the tracking object
ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2: conditionally offers the default route out fa0/1, with an administrative distance of 3, only while tracking object 2 is true, that is, while the probe succeeds.
To summarize: if 172.16.1.1 is reachable, a static default route out fa0/1 with an administrative distance of 3 is offered to the routing table. Because this default route has the higher AD of 3, it remains the backup path for as long as the path via R2 is available.
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
R1(config)# track 1 rtr 11 reachability
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
R1(config)# track 2 rtr 22 reachability
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
If 10.1.1.1 is reachable, a static default route via R2 with an administrative distance of 2 is installed in the routing table. If 172.16.1.1 is reachable, a static default route via R3 with an administrative distance of 3 is available to the routing table as a backup path.
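The selection logic of the two tracked default routes can be modelled as follows. This is a sketch of the decision only, not of how IOS implements it; `TrackedRoute` and `installed_default` are hypothetical names, and the track states are supplied by hand rather than by real probes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedRoute:
    exit_interface: str
    admin_distance: int
    track_id: int


def installed_default(routes, track_state):
    """Pick the default route that would be installed.

    Only routes whose tracking object is up are candidates; among those,
    the lowest administrative distance wins. Returns None if no probe succeeds.
    """
    candidates = [r for r in routes if track_state.get(r.track_id, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.admin_distance)


# The two static default routes from the configuration above.
ROUTES = [
    TrackedRoute("fa0/0", admin_distance=2, track_id=1),  # primary, probe to 10.1.1.1
    TrackedRoute("fa0/1", admin_distance=3, track_id=2),  # backup, probe to 172.16.1.1
]

if __name__ == "__main__":
    # Probe 11 fails while probe 22 still succeeds: the backup takes over.
    print(installed_default(ROUTES, {1: False, 2: True}))
```

Marking track 1 as down while its interface stays up reproduces the failure case that plain floating static routes cannot detect.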
RouterB(config)# ip sla monitor 11
RouterB(config-rtr)# type dns target-addr www.cisco.com name-server 172.20.2.132
RouterB(config-rtr)# frequency 60
RouterB(config-rtr)# exit
RouterB(config)# ip sla monitor schedule 11 life forever start-time now
To measure the time between sending a DNS request and receiving the reply on a Cisco device, use the IP SLAs DNS operation. The configuration above defines an IP SLAs operation of type DNS that looks up the IP address of the hostname www.cisco.com. DNS operation number 11 is scheduled to start immediately and run indefinitely. To view and interpret the results of an IP SLAs operation, use the show ip sla monitor statistics command.
Probes cause a burden if overscheduled, for example if multiple senders overwhelm one receiver, or if the device is already a bottleneck and its CPU utilization is high. Senders generally suffer more from over-scheduling and high probe frequency. Probe scheduling can also be problematic if the clock on the device is out of sync, which is why synchronizing through Network Time Protocol (NTP) is highly recommended.
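A quick back-of-the-envelope estimate helps judge whether a receiver is at risk of being overwhelmed. The sketch below computes the aggregate probe rate arriving at one receiver; the function name and the example numbers (50 senders, 4 operations each, 10-second frequency) are made up for illustration and do not come from any Cisco sizing guideline.

```python
def probes_per_second(senders, operations_per_sender, frequency_seconds):
    """Aggregate probe rate at one receiver.

    Each sender runs `operations_per_sender` operations toward the receiver,
    and each operation repeats once every `frequency_seconds`.
    """
    if frequency_seconds <= 0:
        raise ValueError("frequency_seconds must be positive")
    return senders * operations_per_sender / frequency_seconds


if __name__ == "__main__":
    # Hypothetical deployment: 50 senders x 4 operations, repeating every 10 s.
    print(probes_per_second(50, 4, 10))
```

Raising the frequency value (a longer interval between probes) or spreading operations across several receivers lowers the rate proportionally.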
Cisco Internetwork Performance Monitor (IPM)
Several Cisco network management applications use IP SLAs. One example is the Cisco Internetwork Performance Monitor (IPM) in the CiscoWorks2000 RWAN bundle.
Network Performance Monitoring: Using IP SLA Monitor with Orion NPM
http://www.youtube.com/watch?v=YKXoexOVsaE&feature=related
With RPR, any of the following events triggers a switchover from the active to the standby Supervisor Engine:
Route Processor (RP) or Switch Processor (SP) crash on the active Supervisor Engine.
A manual switchover from the CLI.
Removal of the active Supervisor Engine.
Clock synchronization failure between Supervisor Engines.
In a switchover, the redundant Supervisor Engine becomes fully operational, and the following events occur on the remaining modules during an RPR failover:
All switching modules are power-cycled.
Remaining subsystems on the MSFC (including Layer 2 and Layer 3 protocols) are initialized on the prior standby, now active, Supervisor Engine.
ACLs based on the new active Supervisor Engine are reprogrammed into the Supervisor Engine hardware.
RPR+ enhances Supervisor redundancy compared to RPR by providing the following additional benefits:
Reduced switchover time: Depending on the configuration, the switchover time is in the range of 30 seconds to 60 seconds.
No reloading of installed modules: Because both the startup configuration and the running configuration stay continually synchronized from the active to the redundant Supervisor Engine during a switchover, no reloading of line modules occurs.
Synchronization of Online Insertion and Removal (OIR) events between the active and standby: This occurs such that modules in the online state remain online and modules in the down state remain in the down state after a switchover.