
Accessing the WAN (Network Troubleshooting)

• Once the network is operational, administrators monitor its performance to protect the company's
productivity
• From time to time, network outages (sometimes planned, sometimes unplanned) can occur
• When an unexpected outage occurs, the administrator must troubleshoot the network
Documenting Your Network
• To diagnose & correct network problems efficiently, complete network documentation is
required
• This information is called the Network Baseline
• The documentation consists of Configuration Tables & Topology Diagrams
• It provides a logical diagram of the network & detailed information about each component
• Keep this information safely in a single location (as hard or soft copy); it should include
these components:
 Network Configuration Table – Contains accurate, up-to-date records of the hardware &
software used in the network and provides all the information needed to identify & correct a
network fault; the following data set should be recorded for every component:
 Type of device, model designation
 IOS image name
 Device network hostname
 Location of device (building, floor, room, rack, panel)
 If it is a modular device, include all module types & in which module slot they are located
 Data Link Layer Addresses
 Network Layer Addresses
 Any additional important information about physical aspects of the device
 End-System Configuration Table – Contains baseline records of the hardware & software used
in end-system devices (Servers, Network Management Consoles, & Desktop PCs); for
troubleshooting purposes, the following information should be documented:
 Device name (purpose)
 OS (Operating System) and version
 IP address
 Subnet Mask
 Default gateway, DNS & WINS server addresses
 Any high-bandwidth network applications that the end-system runs
 Network Topology Diagram – Graphical representation of the network that shows its logical
architecture; it shares many components with the network configuration table. Each network device
and each logical & physical connection should be represented with an appropriate symbol; routing
protocols can also be shown. The topology diagram should include:
 Symbols for All Devices & how they are Connected
 Interface Types & Numbers
 IP Addresses
 Subnet Masks

Network Documentation Process


• When documenting your network, gather information directly from Routers and Switches
• Commands useful in the Network Documentation Process include:
 ping tests connectivity with neighboring devices; pinging other hosts in the same network
also triggers the MAC address auto-discovery process
 telnet logs in remotely to a device to access its configuration information
 show ip interface brief displays the up or down status & IP address of all interfaces
 show ip route displays the routing table to learn the directly connected neighbors, more remote
devices (through learned routes), and the routing protocols in use
 show cdp neighbors detail discovers directly connected Cisco neighboring devices

Why is Establishing a Network Baseline Important?


• A Network Performance Baseline collects key performance data from the ports & devices of the network
• It captures the "personality" of the network & answers the following questions:
 How does the Network Perform during a Normal or Average Day?
 Where are the Underutilized and Over-Utilized Areas?
 Where are the most Errors occurring?
 What Thresholds should be set for the Devices that need to be monitored?
 Can the Network Deliver the Identified Policies?

• Without a baseline, there is no standard against which to judge the optimum level of network
traffic & congestion
• Analysis after an initial baseline tends to reveal hidden problems
• It also identifies underutilized areas & quite often leads to network redesign efforts based
on quality and capacity observations
Planning for the First Baseline
• It is important to plan the first baseline carefully
• Recommended steps for planning the first baseline:
 Step 1. Determine what types of data to collect – Start by selecting a few variables that
represent the defined policies (selecting too many data points produces more data than can
easily be analyzed); interface & CPU utilization are good starting points, and software such as
WhatsUp Gold can collect them
 Step 2. Identify devices & ports of interest – Identify the key devices & ports for which
performance data should be measured; devices and ports of interest include:
 Network device ports that connect to other network devices
 Servers
 Key users
 Anything else considered critical to operations
 Step 3. Determine the baseline duration –
 The length of time over which baseline information is gathered is crucial
 Normally a baseline needs to last no more than 6 weeks, unless specific long-term trends
need to be measured
 Generally, a two-to-four-week baseline is adequate
 Trends can be captured on an hourly, daily, or weekly basis
 Weekly trends are just as important as daily or hourly trends
 Sometimes work-week trends are too short to give an accurate picture of the network
 For example, some recurring operations, such as a database backup, run every weekend
 Such recurring patterns are revealed in the monthly trend
 The yearly trend provides meaningful baseline performance details
 Baseline analysis should be conducted on a regular basis
 Perform an annual analysis of the network, or baseline its different sections on a rotating basis
 Regular analysis shows how the network is affected by growth and other changes
 Create a Daily Graph from 5-minute averages
 Weekly Graph from 30-minute averages
 Monthly Graph from 2-hour averages
 Yearly Graph from 1-day averages
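
The graph resolutions listed above amount to averaging raw samples into fixed-width time buckets. The following is a minimal Python sketch of that idea, assuming samples arrive as (timestamp in seconds, utilization percent) pairs; the function name, the dictionary, and the sample data are illustrative and do not come from any monitoring product.

```python
from collections import defaultdict

# Bucket widths in seconds for each graph type listed above.
GRAPH_INTERVALS = {
    "daily": 5 * 60,         # 5-minute average
    "weekly": 30 * 60,       # 30-minute average
    "monthly": 2 * 60 * 60,  # 2-hour average
    "yearly": 24 * 60 * 60,  # 1-day average
}

def average_samples(samples, graph):
    """Average (timestamp_seconds, value) samples into fixed-width buckets.

    Returns a list of (bucket_start_seconds, mean_value) sorted by time.
    """
    width = GRAPH_INTERVALS[graph]
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of the bucket it falls into.
        buckets[timestamp - timestamp % width].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

# Hypothetical interface utilization (%) polled once per minute.
samples = [(0, 10.0), (60, 20.0), (120, 30.0), (300, 50.0), (360, 70.0)]
print(average_samples(samples, "daily"))   # [(0, 20.0), (300, 60.0)]
print(average_samples(samples, "weekly"))  # [(0, 36.0)]
```

Coarser buckets smooth out short spikes, which is why the same data looks different on the daily and weekly graphs.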

Measuring Network Performance Data


• Sophisticated Network Management Software is used to baseline large & complex networks
• For example, the Fluke Network SuperAgent module automatically creates & reviews reports
• This feature compares current performance levels with historical observations
• It automatically identifies performance problems & applications that do not provide the
expected service levels
• Network Management Software or Protocol Inspectors & Sniffers may run continuously
over the course of the data collection process
• Data can also be collected manually with show commands on mission-critical network devices
• Cisco IOS commands that can be used to manually collect device data:
 show version (shows uptime & version information for device software & hardware)
 show ip interface [brief] (shows all the configuration options set on an interface; the brief
keyword shows only the up / down status of the IP interfaces & the IP address of each interface)
 show interface (shows detailed output for each interface)
 show ip route (shows the contents of Routing Table)
 show arp (shows the contents of ARP Table)
 show running-config (shows the current configuration stored in RAM)
 show port (shows the status of ports on a Switch)
 show vlan (shows the status of VLANs on a Switch)
 show tech-support (Runs other show commands & provides many pages of detailed output,
designed to be sent to technical support, also useful for other purposes)
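
Output collected manually with these commands can be turned into documentation records by a short script. The sketch below parses `show ip interface brief` output in Python. The sample text is invented for illustration and assumes the usual column layout (Interface, IP-Address, OK?, Method, Status, Protocol), which may differ between IOS versions; the function name is hypothetical.

```python
def parse_ip_interface_brief(output):
    """Return a list of dicts, one per interface row."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 6 or fields[0] == "Interface":
            continue
        rows.append({
            "interface": fields[0],
            "ip_address": fields[1],
            # Status may be two words ("administratively down").
            "status": " ".join(fields[4:-1]),
            "protocol": fields[-1],
        })
    return rows

# Illustrative sample, not captured from a real device.
SAMPLE = """\
Interface              IP-Address      OK? Method Status                Protocol
FastEthernet0/0        192.168.10.1    YES manual up                    up
Serial0/0/0            10.1.1.1        YES manual up                    down
Serial0/0/1            unassigned      YES unset  administratively down down
"""

for row in parse_ip_interface_brief(SAMPLE):
    print(row["interface"], row["ip_address"], row["status"], row["protocol"])
```

Storing the parsed rows alongside the configuration tables makes later comparisons against the baseline easier.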
Troubleshooting Methodologies and Tools
• Troubleshooting is a time-consuming process
• There are two extreme approaches, & both usually end in disappointment, delay, or failure
• One is the Theorist or Rocket Scientist approach & the other is the Caveman approach
• The rocket scientist analyzes & reanalyzes the situation until the exact cause of the problem is found
• This is fairly reliable but takes a long time (few companies can afford to have their networks
down for hours)
• The caveman's first instinct is to start swapping cards, cables, hardware, & software until
the network miraculously begins operating again; this is not very reliable, & the root cause of the
problem may still be present
• A systematic approach minimizes confusion and cuts down on troubleshooting time
OSI versus TCP/IP Layered Models
OSI Reference Model
• Provides a common language for network engineers & is commonly used in troubleshooting networks
• Problems are typically described in terms of OSI layers
• Divides networking architecture into modular layers
• Describes how information from a software application in one computer moves through a
network medium to a software application in another computer
• Upper layers (5-7) deal with application issues & are generally implemented only in
software
• Application layer is closest to the end user
• Lower layers (1-4) handle data-transport issues
• Layers 3 & 4 are generally implemented only in software
• Physical & Data Link layers are implemented in hardware & software
• Physical layer is closest to the physical network medium (i.e. cabling)
TCP/IP Model
• Has 4 layers, which correspond to the 7 layers of the OSI Reference Model
• Application layer combines the functions of the three upper OSI layers (Session, Presentation,
& Application)
• It provides communication between applications such as FTP, HTTP, & SMTP on separate hosts
• Transport layers of TCP/IP and OSI directly correspond in function
• Transport layer is responsible for exchanging segments between devices on a TCP/IP network
• TCP/IP Internet layer relates to the OSI Network layer
• Internet layer is responsible for placing messages in a fixed format that allows devices to handle
them
• TCP/IP Network Access layer corresponds to the OSI Physical & Data Link layers
• Network Access layer communicates directly with the network media
• It provides an interface between the architecture of the network and the Internet layer
Devices at OSI Layers:
Router : Layers 1 through 4
Firewall : Layers 1 through 4
Multilayer Switch : Layers 1 through 4
Standard Switch : Layers 1 and 2
Hub : Layer 1
End system : Layers 1 through 7
General Troubleshooting Procedures
• The stages of the general troubleshooting process are:
 Stage 1: Gather symptoms – Troubleshooting begins with gathering & documenting
symptoms from the network, end systems, and users; the network administrator also determines which
components are affected & how their behavior differs from the baseline. Symptoms may appear as
alerts, console messages, & user complaints; use them to localize the problem to a smaller range
of possibilities
 Stage 2: Isolate the problem – The problem is not truly isolated until a single problem, or a set of
related problems, is identified; to do this, the network administrator examines the problem at the
logical layers of the network
 Stage 3: Correct the problem – After isolating & identifying the cause of the problem, the network
administrator works to correct it by implementing, testing, & documenting a solution
Troubleshooting Methods
• Three main methods for troubleshooting networks:
 Bottom up – Start with the physical components of the network and move up through the OSI
layers; it is a good approach when the problem is suspected to be a physical one, and most
networking problems reside at the lower levels. The disadvantage is that it requires checking
every device and interface on the network, & each conclusion and possibility must be documented,
so there can be a lot of paperwork; a further challenge is deciding which devices to
examine first
 Top down – Start with the end-user applications and move down through the OSI layers; use this
approach for simpler problems or when you think the problem is with a piece of software. The
disadvantage is that it requires checking every network application, & each conclusion and
possibility must be documented; the challenge is deciding which application to examine first
 Divide and conquer – Select a layer and test in both directions from that starting
layer: collect the users' experience of the problem, document the symptoms, & then
make an informed guess as to which OSI layer to start from. Once you verify that a
layer is functioning properly, assume that the layers below it are functioning and work up the
OSI layers; if a layer is not functioning properly, work your way down the OSI model.
For example, if users can't access the web server but you can ping the server, the
problem is above Layer 3; if you can't ping the server, the problem is likely at a lower
OSI layer
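
The divide-and-conquer rule can be sketched as a small decision function. This is a minimal Python illustration, assuming the tester reports one layer number and whether it works; the function name is hypothetical and the layer numbers are the standard OSI layers 1 to 7.

```python
def next_direction(layer, layer_ok):
    """Divide and conquer: decide which OSI layers to examine next.

    If the tested layer works, the layers below it are assumed to work,
    so the search moves up; otherwise it moves down from the tested layer.
    """
    if layer_ok:
        return list(range(layer + 1, 8))  # remaining suspects, bottom to top
    return list(range(layer, 0, -1))      # this layer and below, top to bottom

# Users cannot reach the web server, but ping (Layer 3) succeeds.
print(next_direction(3, True))   # [4, 5, 6, 7]
# Ping fails: the problem is likely at Layer 3 or lower.
print(next_direction(3, False))  # [3, 2, 1]
```

Each test halves or narrows the set of suspect layers, which is why this method usually converges faster than a strict bottom-up or top-down pass.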
Gathering Symptoms
• To determine the scope of the problem, gather (& document) the symptoms
• Each step in this process is briefly described here:
 Step 1. Analyze Existing Symptoms – Analyze symptoms gathered from the trouble ticket, users, or
end systems affected by the problem to form a definition of the problem
 Step 2. Determine Ownership – If the problem is within your system, move on to the next
stage; if it is outside the boundary of your control, contact the administrator of the external system
 Step 3. Narrow the Scope – Determine whether the problem is at the Core, Distribution, or Access
Layer of the network, then analyze the existing symptoms to identify the equipment most likely at fault
 Step 4. Gather Symptoms from Suspect Devices – Gather hardware & software symptoms
from the suspect devices & determine whether the problem lies in hardware or in software configuration
 Step 5. Document Symptoms – Sometimes the problem can be solved using the documented
symptoms
Use Cisco IOS commands to gather symptoms about the network
 The debug command is good for gathering symptoms, but it generates a large amount of console
message traffic and can noticeably affect the performance of a network device; remember
to disable debugging when you are done.
Other commands are:
ping {host | ip-address}: sends an echo request packet to an address
traceroute {destination}: identifies the path a packet takes through the networks
telnet {host | ip-address}: connects to an IP address using the Telnet application
show ip interface brief: displays a summary of the status of all interfaces on a device
show ip route: displays the current IP routing table
show running-config interface: displays the currently running configuration for a particular interface
[no] debug ?: displays a list of options for enabling or disabling debugging events on a device
show protocols: displays the configured protocols with global & interface-specific status
Questioning End Users
Example Question: What does not work?
Example Question: Are the things that do work & the things that do not work related?
Example Question: Has the thing that does not work ever worked?
Example Question: When was the problem first noticed?
Example Question: What has changed since the last time it did work?
Example Question: Can you reproduce the problem?
Example Question: When exactly did the problem occur?
Software Troubleshooting Tools
• A wide variety of software and hardware tools are available to make troubleshooting
easier
NMS (Network Management System) Tools
• Include device-level monitoring, configuration, & fault management tools
• Examples: WhatsUp Gold, CiscoView, HP OpenView, and SolarWinds
Knowledge Bases
• Online network device vendor knowledge bases are among the best sources of information
• Example: the Cisco Tools & Resources page at http://www.cisco.com is a free resource that
contains Troubleshooting Procedures, Implementation Guides, & Original White Papers
Baselining Tools
• Many tools are available for automating the network documentation and baselining process
• Available for Windows, Linux, AUX operating systems
• Example: SolarWinds LAN Surveyor & CyberGauge software
Protocol Analyzers
• Decode the various protocol layers in a recorded frame & present this information in a
relatively easy-to-use format
• Example: Wireshark protocol analyzer
Hardware Troubleshooting Tools
Network Analysis Module (NAM)
• A NAM can be installed in Cisco Catalyst 6500 Switches & Cisco 7600 Routers
• It provides a graphical representation of traffic from local & remote Switches & Routers
• The NAM has an embedded browser-based interface; it captures and decodes packets and tracks
response times to pinpoint whether an application problem lies in the network or the server
Digital Multi-Meters (DMMs)
• Test instruments used to directly measure electrical values of voltage, current, and resistance
• Used to check power-supply voltage levels & to verify that network devices are receiving power
Example: Fluke Networks 179 Digital Multimeter
Cable Testers
• Specialized handheld devices for testing various types of data communication cabling
• Detect broken wires, crossed-over wiring, shorted connections, & improperly paired
connections
• Range from inexpensive Continuity Testers through moderately priced Data Cabling Testers to
expensive Time-Domain Reflectometers (TDRs)
• TDRs pinpoint the distance to a break in a cable
• A TDR sends signals along the cable and waits for them to be reflected (then calculates the
distance)
• TDRs used to test fiber-optic cables are known as Optical TDRs (OTDRs)
Example: Fluke Networks LinkRunner Pro testers or Fluke Networks CableIQ Qualification testers
Cable Analyzers
• Multifunctional handheld devices that test & certify copper & fiber cables
• More sophisticated tools include Advanced Troubleshooting Diagnostics that measure the
distance to a performance defect (NEXT, RL), identify corrective actions, & graphically display
crosstalk & impedance behavior
• Cable Analyzers also include PC-based software
• Collected field data is uploaded to the PC, where the software creates up-to-date & accurate reports
Example: Fluke Networks DTX Cable Analyzer
Portable Network Analyzers
• Portable devices for troubleshooting Switched Networks & VLANs
• Plugged in anywhere on the network, the analyzer shows the Switch port to which it is connected &
the average and peak utilization
• Also used to discover VLAN configuration, identify top network talkers, analyze network traffic,
and view interface details
• The device can output to a PC, where network monitoring software performs further analysis
Example: Fluke Networks OptiView Series III Integrated Network Analyzer
Common WAN Implementation Issues
• WAN data transfer speeds are considerably slower than common LAN bandwidths
• WAN links are also costly, yet users (clients) want access to more services at higher speeds
• WAN links must therefore be provided at reduced cost & with optimal performance
• WANs carry a variety of traffic (data, voice, and video)
• The design must provide adequate capacity & take the topology of the connections into
account
• WAN technologies function at the lower three layers of the OSI model
• Routers determine the most appropriate path
• Routers also provide QoS management, which allots priorities to the different traffic streams
• CSU/DSUs act as the modems of the WAN
Steps in WAN Design
• Businesses install WAN connectivity to move data between external branches
• Each time a modification to an existing WAN is considered, these steps should be followed:
Step 1: Locate LANs – Establish the source & destination endpoints that will connect through the WAN
Step 2: Analyze Traffic – Know what data traffic must be carried, its origin, and its destination,
because different traffic types have varying requirements for bandwidth, latency, and jitter
Step 3: Plan the Topology – The topology is influenced by geographic considerations & by
availability requirements, which may call for extra links for redundancy and load balancing
Step 4: Estimate the Required Bandwidth – Traffic on the links may have varying requirements for
latency & jitter
Step 5: Choose the WAN Technology – Suitable link technologies must be selected
Step 6: Evaluate Costs – Installation & operational costs of the WAN must be taken into account
WAN Traffic Considerations
• The table shows the wide variety of traffic types and their varying requirements:
Traffic Types:
Traffic                          Latency   Jitter    Bandwidth
Voice                            Low       Low       Medium
Transaction Data (e.g., SNA)     Medium    Medium    Medium
Messaging (e-mail)               High      High      High
File Transfer                    High      High      High
Batch Data                       High      High      High
Network Management               High      High      Low
Videoconferencing                Low       Low       High
Jitter: Deviation in or displacement of some aspect of the pulses in a high-frequency digital signal

Traffic Characteristics:
Characteristic                                          Description
Connectivity & Volume Flows                             Where does this traffic flow & how much traffic flows there?
Client / Server Data                                    What kind of traffic flows between Client & Server?
Latency Tolerance, including Length & Variability       Can the users tolerate delays? How much & how often?
Network Availability Tolerance                          How critical is WAN availability to the users of this LAN?
Error-Rate Tolerance                                    Is this noisy traffic?
Priority                                                Does this traffic have priority over other traffic?
Protocol Type                                           What types of protocols operate within the network?
Average Packet Length                                   What is the average size of packets being transmitted?

WAN Topology Considerations


• Designing a suitable WAN topology essentially consists of the following:
• Selecting an interconnection pattern or layout for the links between the various locations
• Selecting technologies for those links that meet enterprise requirements at an acceptable cost
Many WANs use a Star Topology in which branches are connected to the head office; Star endpoints are
sometimes cross-connected, creating a Mesh or Partial Mesh Topology. When designing, re-evaluating,
or modifying a WAN, select the topology carefully, keeping in mind cost, multiple paths between
destinations, redundancy, & reliability.
When many locations must be joined, a Hierarchical Solution is recommended; a full Mesh Network is
not feasible because it could require hundreds of thousands of links. For a hierarchical topology,
group the LANs in each area and interconnect them to form a region, then interconnect the regions to
form the core of the WAN
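
The reason a full mesh stops being feasible is that the number of links grows with the square of the number of sites: n sites need n(n-1)/2 point-to-point links, while a star needs only n-1. A short Python sketch makes the growth concrete; the site counts are arbitrary examples and the function names are illustrative.

```python
def full_mesh_links(sites):
    """Point-to-point links needed to connect every site to every other site."""
    return sites * (sites - 1) // 2

def star_links(sites):
    """Links needed when every branch connects only to one central site."""
    return sites - 1

for sites in (10, 100, 1000):
    print(sites, full_mesh_links(sites), star_links(sites))
```

For 1,000 sites a full mesh needs 499,500 links against 999 for a star, which is why large WANs group sites into areas and regions instead.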
WAN Connection Technologies
• A typical private WAN uses a combination of technologies, chosen according to traffic type & volume
• ISDN, DSL, Frame Relay, or leased lines are used to connect individual branches into an area
• Frame Relay, ATM, or leased lines are used to connect external areas back to the backbone
• ATM or leased lines form the WAN backbone
• Technologies that must establish a connection before sending data, such as basic telephone (PSTN),
ISDN, or X.25, are not suitable for WANs that require rapid response time or low latency
• Frame Relay & ATM are examples of shared networks, which reduce cost
• ATM & Frame Relay networks carry traffic from several customers over the same internal
links
• The enterprise has no control over the number of links or hops that its data traverses in an
FR/ATM network
• Nor can it control the time data must wait at each node before moving to the next link
• This uncertainty in latency & jitter makes FR/ATM unsuitable for some types of network traffic
• Although ATM is a shared network, it is designed for minimal latency & jitter: its high-speed
internal links carry ATM cells, which have a fixed length of 53 bytes
• Frame Relay can also carry delay-sensitive traffic by using QoS mechanisms
• Bandwidth = Available Bit Transfer Rate (bit rate capacity)

Technology        Charge                  Typical Bit Rate                       Other
Leased Line       Distance, Capacity      up to 45 Mbps (E3/T3)                  Permanent, Fixed Capacity
Basic Telephone   Distance, Time          33 to 56 kbps                          Dialed, Slow Connection
ISDN              Distance, Time          64 / 128 kbps (BRI), 2 Mbps (PRI)      Dialed, Slow Connection
X.25              Volume (Dist. NP)       up to 48 kbps                          Switched, Fixed Capacity
ATM               Capacity (Dist. NP)     up to 155 Mbps                         PVC or SVC
Frame Relay       Capacity (Dist. NP)     up to 1.5 Mbps                         PVC or SVC
DSL               Monthly Subscription    up to 3 Mbps                           Always On, shared Internet
Metro Ethernet    Monthly Subscription    up to 500 Mbps                         Limited Geographical scope

Common WAN Implementation Issues


• Common WAN implementation issues, & the questions to answer before implementing a WAN:
Reliability? Our branch depends on the WAN; is reliability essential?
Private or Public? Which infrastructure should I use?
Latency? Can delays be a problem for real-time traffic?
Confidentiality? Is it safe to send sensitive company information to branches across the WAN?
Security? How do we protect ourselves from security threats over the WAN?
QoS? End-to-end QoS may be hard to obtain across the Internet
Case Study: WAN Troubleshooting from an ISP's Perspective
• Test the link from the Customer Edge Router to the ISP Edge Router by asking the customer to log in
to their Router & send a hundred 1500-byte pings (Stress Pings) to the IP address of the ISP Edge Router
• The ISP representative can also run Stress Pings from the ISP Edge Router to the Customer Edge Router
• In some cases, the slowness may be caused by server congestion
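
The result of a stress ping is usually judged from the summary line that IOS prints at the end of the test. The Python sketch below extracts that success rate; it assumes the summary has the form "Success rate is N percent (received/sent)", and the sample text and address (from the 192.0.2.0/24 documentation range) are illustrative rather than captured from a real router.

```python
import re

def ping_success_rate(output):
    """Extract (percent, received, sent) from IOS-style ping output, or None."""
    match = re.search(r"Success rate is (\d+) percent \((\d+)/(\d+)\)", output)
    if match is None:
        return None
    percent, received, sent = (int(group) for group in match.groups())
    return percent, received, sent

# Illustrative output of a 100-packet, 1500-byte stress ping with one loss.
SAMPLE = (
    "Sending 100, 1500-byte ICMP Echos to 192.0.2.1, timeout is 2 seconds:\n"
    + "!" * 40 + "." + "!" * 59 + "\n"
    + "Success rate is 99 percent (99/100), round-trip min/avg/max = 4/6/20 ms\n"
)

print(ping_success_rate(SAMPLE))  # (99, 99, 100)
```

Running the same test in both directions, as described above, helps show whether loss is specific to one side of the link.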
Ask the Customer:
 What, if anything, has changed since before you started seeing this problem?
 Have you power-cycled (rebooted) the Router, Switch, PC, and Server?
 Would you be willing to do it again while I stay on the phone with you?
 Has there been a power outage, lightning strike, or power brown-out in your area recently?
 Do you have up-to-date antivirus software on your PCs?

Also do the following:


 Ask customers to fax or e-mail you their network diagram
 Help customers isolate the different parts of the Internet
Interpreting Network Diagrams to Identify Problems
• It is nearly impossible to troubleshoot any type of network connectivity issue without a network diagram
• The diagram must show IP addresses & IP routes, devices such as Firewalls & Switches, and so on
• Generally, both Logical and Physical Topologies aid in troubleshooting
Physical Network Diagram
• A Physical Network Diagram typically includes:
 Device Type
 Model and Manufacturer
 Operating System Version
 Cable Type and Identifier
 Cable specification
 Connector Type
 Cabling Endpoints
Logical Network Diagram
• It shows how data is transferred on the network
• Symbols are used to represent network elements
• A Logical Network Diagram may include:
 Device identifiers
 IP Address and Subnet
 Interface Identifiers
 Connection Type
 DLCI for Virtual Circuits
 Site-to-Site VPNs
 Routing Protocols
 Static Routes
 Data-Link Protocols
 WAN Technologies used
Symptoms of Physical Layer Problems
• Layer 1 transmits bits from one computer to another & regulates the transmission of the bit stream
• Both failures and suboptimal conditions at the Physical layer cause problems
• A Physical layer problem occurs when the physical properties of the connection are substandard
• With suboptimal operation at the Physical layer, the network may be operational, but
performance is consistently or intermittently lower than the level specified in the baseline
 Common symptoms of network problems at the Physical layer include:
 Performance Lower than Baseline – The most common reasons for slow or poor performance
include overloaded or underpowered Servers, unsuitable Switch or Router configurations, traffic
congestion on a low-capacity link, and chronic frame loss
 Loss of Connectivity – If a cable or device fails, the most obvious symptom is a loss of
connectivity, as indicated by a simple ping test; intermittent loss of connectivity could indicate
a loose or oxidized connection
 High Collision Counts – Collision domain problems affect the local medium & disrupt
communications to Layer 2 or Layer 3 infrastructure devices, local Servers, or services;
collisions are often caused by a bad cable, a bad uplink cable, or a link that is exposed to
external electrical noise
 Network Bottlenecks or Congestion – If a Router, interface, or cable fails, Routing Protocols
may redirect traffic to other routes that are not designed to carry the extra load; this can
result in congestion or bottlenecks
 High CPU Utilization Rates – High CPU utilization rates are a problem; if not addressed
quickly, CPU overloading can cause a device to shut down or fail
 Console Error Messages – Error messages reported on the device console may indicate a
Physical layer problem
Causes of Physical Layer Problems
 Power-related – Power-related issues are the most fundamental reason for network failure
 Hardware faults – Faulty NICs can cause errors such as Late Collisions, Short Frames, &
Jabber. "Jabber is the condition in which a network device continually transmits random,
meaningless data onto the network"; other causes of jabber are a faulty or corrupt NIC driver,
bad cabling, or grounding problems
 Cabling faults – Many Layer 1 problems can be corrected by simply reseating partially
disconnected cables; poorly crimped RJ-45 connectors & suspect cables should be tested or
exchanged. Fiber-optic problems are caused by dirty connectors, excessively tight bends, and swapped
RX/TX connections when polarized. Coaxial cable problems often occur at the connectors, for example
when the center conductor on the cable end is not straight & of the correct length
 Attenuation – A data bitstream is attenuated when the amplitude of its bits is reduced;
excessive attenuation garbles the transmission. A major cause is a cable length that exceeds the
design limit for the media (e.g. the Ethernet cable limit is 100 meters); another is a poor
connection resulting from a loose cable or dirty or oxidized contacts
 Noise – Local Electro-Magnetic Interference (EMI) is commonly known as noise; there are 4 types
of noise:
 Impulse noise: caused by voltage fluctuations or current spikes induced on the cabling
 Random (white) noise: generated by many sources, e.g. FM Radio Stations, Police Radio,
Building Security, and Avionics for Automated Landing
 Alien crosstalk: noise induced by other cables in the same pathway
 Near End crossTalk (NEXT): noise originating from crosstalk from other adjacent cables, devices
with large electric motors, or anything that includes a transmitter more powerful than a cell phone
 Interface configuration errors – A misconfigured interface may go down; configuration
errors that affect the Physical layer include:
 Serial links reconfigured as asynchronous instead of synchronous
 Incorrect clock rate
 Incorrect clock source
 Interface not turned on
 Exceeding design limits – A component may operate suboptimally because it is being
utilized at a higher average rate than it is configured to operate
 CPU overload – Symptoms include high CPU utilization percentages, input queue drops, slow
performance, Router services (Telnet / ping) that are slow or fail to respond, & missing routing
updates

 To isolate problems at the Physical layer, do the following:


 Check for bad cables or connections – Verify that the cable from the source interface is
properly connected and is in good condition
 Check that the correct cabling standard is adhered to throughout the network – Verify
that the proper cable is being used
 Check that devices are cabled correctly – Check to make sure that all cables are
connected & any cross-connects are properly patched to the correct location
 Verify proper interface configurations – Check that all Switch ports are set in the correct VLAN
& that STP, speed, and duplex settings are correct; confirm that any active ports are not shut down
 Check operational statistics and data error rates – Use Cisco show commands to check
for statistics such as collisions and input and output errors
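
The check of operational statistics can be made quantitative by relating error counters to the total traffic seen. The Python sketch below computes an input error percentage from `show interfaces` style text; it assumes counter lines of the form "N packets input" and "N input errors", and the sample values and function name are invented for illustration.

```python
import re

def input_error_rate(show_interface_output):
    """Return input errors as a percentage of packets received, or None."""
    packets = re.search(r"(\d+) packets input", show_interface_output)
    errors = re.search(r"(\d+) input errors", show_interface_output)
    if not packets or not errors:
        return None  # counters not found in the supplied text
    total = int(packets.group(1))
    if total == 0:
        return 0.0
    return 100.0 * int(errors.group(1)) / total

# Illustrative counter lines, not captured from a real device.
SAMPLE = """\
     200000 packets input, 25600000 bytes, 0 no buffer
     500 input errors, 480 CRC, 20 frame, 0 overrun, 0 ignored, 0 abort
"""

print(input_error_rate(SAMPLE))  # 0.25
```

Comparing this percentage with the value recorded in the baseline shows whether the interface has degraded since the network was documented.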
Symptoms of Data Link Layer Problems
 Troubleshooting Layer 2 problems can be a challenging process
• Data Link layer problems cause common symptoms that assist in identifying Layer 2 issues
• Common symptoms of network problems at the Data Link layer include:
 No functionality or connectivity at Network layer or above – Some Layer 2 problems can
stop the exchange of frames across a link, while others only cause network performance to degrade
 Network is operating below baseline performance levels – Two distinct types of
suboptimal Layer 2 operation can occur in a network:
 Frames take an illogical path to their destination but do arrive (for example, because of a poorly
designed Layer 2 spanning-tree topology), so they travel a longer & more congested path
 Some frames are dropped; in an Ethernet environment, an extended or continuous ping
reveals whether frames are being dropped
 Excessive broadcasts – Modern operating systems use broadcasts extensively to discover network
services & other hosts; generally, the reasons for excessive broadcasts are:
 Poorly programmed or configured applications
 Large Layer 2 Broadcast Domains
 Underlying network problems, such as STP Loops or Route Flapping
 Console messages – The most common console message that indicates a Layer 2 problem is a
line protocol down message; typical reasons are an encapsulation mismatch, framing problems, or
keepalives that do not arrive
Causes of Data Link Layer Problems
 Issues at layer 2 that commonly result in network connectivity or performance problems include:
 Encapsulation Errors – Occur when the bits the sender places in a field are not what the receiver expects, typically because the two ends of a link are configured with different encapsulations (HDLC, Frame Relay, PPP, etc.)
 Address Mapping Errors – In point-to-multipoint, Frame Relay, or broadcast Ethernet topologies, a static or dynamic map must translate the destination Layer 3 address to the correct Layer 2 address. In Frame Relay, a static map can simply be configured incorrectly; dynamic mapping can fail because:
 Devices may have been configured not to respond to ARP or Inverse ARP requests
 Cached Layer 2 or Layer 3 information may be stale because the network has physically changed
 Invalid ARP replies are received because of a misconfiguration or a security attack
 Framing Errors – A framing error occurs when a frame does not end on an 8-bit byte boundary, usually because of a noisy serial line, an improperly designed cable (too long or not properly shielded), or an incorrectly configured CSU line clock; the receiver may then be unable to determine where one frame ends and the next begins, and keepalives may not be exchanged
 STP Failures or Loops – Most STP problems revolve around these issues:
 Forwarding loops, which occur when no port in a redundant topology is blocked
 Excessive flooding caused by a high rate of STP topology changes; a topology change should be a rare event in a well-configured network, but a flapping port causes repeated topology changes and flooding
 Slow STP convergence or reconvergence, caused by a configuration error (such as inconsistent STP timers), an overloaded switch CPU during convergence, or a software defect
Troubleshooting Layer 2 – PPP
Troubleshooting Layer 2 technologies such as PPP and Frame Relay is difficult because common Layer 3 tools such as ping can do little more than show that the link is down; the network technician needs a thorough understanding of these protocols and of the relevant Cisco IOS commands
 Most PPP problems involve link negotiation; the steps for troubleshooting PPP are:
 Step 1. – Check that the appropriate encapsulation is in use with the show interfaces serial command
 Step 2. – Confirm that the LCP negotiations have succeeded by checking the same output for the LCP Open message
 Step 3. – Verify authentication on both sides of the link using the debug ppp authentication command
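Steps 1 and 2 above amount to looking for two strings in the output of show interfaces serial. A minimal sketch, assuming the output contains a line such as "Encapsulation PPP, LCP Open"; the sample text is abridged and illustrative, and the function name is hypothetical.

```python
from typing import Optional

# Abridged, illustrative output of `show interfaces serial` on a healthy PPP link.
SAMPLE_OK = """\
Serial0/0/0 is up, line protocol is up
  Encapsulation PPP, LCP Open
  Open: IPCP, CDPCP
"""


def first_failed_ppp_step(show_output: str) -> Optional[str]:
    """Return a description of the first failed check, or None if both pass."""
    if "Encapsulation PPP" not in show_output:
        return "Step 1: encapsulation is not PPP"
    if "LCP Open" not in show_output:
        return "Step 2: LCP negotiation has not succeeded"
    # Step 3 (authentication) needs `debug ppp authentication` and is not covered here.
    return None


print(first_failed_ppp_step(SAMPLE_OK))  # None: both checks pass
```

The check is deliberately coarse: it only tells the technician which step to look at first.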
Troubleshooting Layer 2 – Frame Relay
Troubleshooting a Frame Relay network can be broken down into four steps:
 Step 1. – Verify the physical connection between the CSU/DSU and the router
 Step 2. – Verify that the router and the Frame Relay service provider are properly exchanging LMI information by using the show frame-relay lmi command
 Step 3. – Verify that the PVC status is active by using the show frame-relay pvc command
 Step 4. – Verify that the Frame Relay encapsulation matches on both routers with the show interfaces serial command
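Step 3 can be sketched as a small parser. This assumes each PVC appears in the show frame-relay pvc output on a line of the form "DLCI = 102, DLCI USAGE = LOCAL, PVC STATUS = ACTIVE, INTERFACE = Serial0/0/0"; the helper names are hypothetical, and the status meanings are the commonly taught interpretations rather than an exhaustive diagnosis.

```python
import re

# One match per PVC line: the DLCI number and its reported status.
PVC_RE = re.compile(r"DLCI = (\d+),.*?PVC STATUS = (\w+)")

# Commonly taught interpretation of each status (a simplification).
MEANING = {
    "ACTIVE": "PVC is operational end to end",
    "INACTIVE": "local link to the Frame Relay switch works, remote end does not",
    "DELETED": "DLCI is not known to the Frame Relay switch, or no LMI is received",
}


def pvc_status(output: str) -> dict:
    """Return {dlci: status} for every PVC line found in the output."""
    return {int(dlci): status for dlci, status in PVC_RE.findall(output)}


def non_active_pvcs(output: str) -> list:
    """List (dlci, status, meaning) for every PVC that is not ACTIVE."""
    return [
        (dlci, status, MEANING.get(status, "unknown status"))
        for dlci, status in sorted(pvc_status(output).items())
        if status != "ACTIVE"
    ]


sample = (
    "DLCI = 102, DLCI USAGE = LOCAL, PVC STATUS = ACTIVE, INTERFACE = Serial0/0/0\n"
    "DLCI = 103, DLCI USAGE = LOCAL, PVC STATUS = INACTIVE, INTERFACE = Serial0/0/0\n"
)
print(non_active_pvcs(sample))
```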

Troubleshooting Layer 2 – STP Loops


STP prevents loops in a switched network
A switch should have STP disabled only if it is not part of a physically looped topology
 STP can be enabled with the spanning-tree vlan ID command
 Use these steps to troubleshoot forwarding loops:
 Step 1. Identify that an STP loop is occurring; the usual symptoms of a forwarding loop are:
 Loss of connectivity to, from, and through the affected network regions
 High CPU utilization on routers connected to the affected segments or VLANs
 High link utilization (often 100%)
 High switch backplane utilization (compared to the baseline utilization)
 Syslog messages that indicate packet looping (e.g. HSRP duplicate IP address messages)
 Syslog messages that indicate constant address relearning or MAC address flapping
 An increasing number of output drops on many interfaces
 Step 2. Discover the topology (scope) of the loop by looking at the ports with the highest link utilization; the show interface command displays the utilization of each interface
 Step 3. Break the loop by shutting down or disconnecting the involved ports, then check whether the switch backplane utilization is back to a normal level
 Step 4. Find and fix the cause of the loop; first investigate the topology diagram, then, for every switch on the redundant path, check for these issues:
 Does the switch know the correct STP root?
 Is the root port identified correctly?
 Are BPDUs received regularly on the root port and on ports that are supposed to be blocking?
 Are BPDUs sent regularly on non-root, designated ports?
 Step 5. Restore the redundancy by restoring the redundant links that were disconnected
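One of the Step 1 symptoms, MAC address flapping, can be illustrated with a short sketch. It assumes the address-learning events have already been extracted (for example from syslog) into (mac, port) pairs in the order they occurred; that input format, the threshold, and the function names are hypothetical.

```python
from collections import defaultdict


def count_mac_moves(events):
    """Count how often each MAC address is relearned on a different port.

    events: iterable of (mac, port) tuples in the order they were learned.
    """
    last_port = {}
    moves = defaultdict(int)
    for mac, port in events:
        if mac in last_port and last_port[mac] != port:
            moves[mac] += 1
        last_port[mac] = port
    return dict(moves)


def flapping_macs(events, threshold=3):
    """Return the MAC addresses that moved between ports at least `threshold` times."""
    return sorted(mac for mac, n in count_mac_moves(events).items() if n >= threshold)


events = [
    ("0000.0c12.3456", "Fa0/1"),
    ("0000.0c12.3456", "Fa0/2"),
    ("0000.0c12.3456", "Fa0/1"),
    ("0000.0c12.3456", "Fa0/2"),
    ("0000.0c99.0001", "Fa0/3"),
    ("0000.0c99.0001", "Fa0/3"),
]
print(flapping_macs(events))
```

The ports between which an address keeps moving are good candidates for the scope of the loop in Step 2.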

Symptoms of Network Layer Problems


• Network layer problems include any problem that involves a Layer 3 protocol, both routed and routing protocols
• Problems at the Network layer can cause network failure or suboptimal performance
• Optimization issues can in general be more difficult to detect, isolate, and diagnose
• Symptoms of Network layer problems:
 Network failure
 Network performance below baseline

Troubleshooting Layer 3 Problems


• Badly configured static routes can create routing loops and make parts of the network unreachable
• Troubleshooting dynamic routing protocols requires a thorough understanding of how the specific protocol functions
• Routing problems are best solved with a methodical process, using a series of commands to isolate and diagnose the problem
 Some areas to explore when diagnosing a possible routing protocol problem:
 General network issues: Often a change in the topology has unintended effects elsewhere; changes may include the installation of new routes (static or dynamic), the removal of other routes, and so on. Also ask whether anyone is currently working on the network infrastructure
 Connectivity issues: Check for equipment and connectivity problems, including power problems and overheating, as well as cabling problems, bad ports, and ISP problems
 Neighbor issues: If the routing protocol establishes adjacencies with neighbors, check for problems forming them
 Topology database: If the routing protocol uses a topology table or database, check it for missing or unexpected entries
 Routing table: Check the routing table for missing or unexpected routes; debug commands can show routing updates and routing table maintenance
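The routing table check in the last item is essentially a comparison against the network baseline. A minimal sketch, assuming both the baseline and the current table have already been reduced to lists of prefixes; the function name and sample prefixes are illustrative.

```python
import ipaddress


def diff_routes(baseline, current):
    """Compare two lists of prefixes and return (missing, unexpected)."""
    base = {ipaddress.ip_network(prefix) for prefix in baseline}
    cur = {ipaddress.ip_network(prefix) for prefix in current}
    missing = [str(net) for net in sorted(base - cur)]      # in baseline, gone now
    unexpected = [str(net) for net in sorted(cur - base)]   # present now, not in baseline
    return missing, unexpected


baseline = ["10.0.0.0/8", "172.16.0.0/16", "192.168.1.0/24"]
current = ["10.0.0.0/8", "172.16.0.0/16", "192.168.2.0/24"]
print(diff_routes(baseline, current))
```

Parsing prefixes with `ipaddress` rather than comparing strings means that differently formatted but equal prefixes still compare equal.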

Transport Layer Troubleshooting


Common Access List Issues
• Network problems commonly arise at the edge of the network, where security technologies are implemented
• The two most commonly implemented security technologies covered at the Transport layer are ACLs and NAT
 The most common issues with ACLs are caused by improper configuration:
 Selection of traffic flow: The ACL must be applied to the correct interface and in the correct traffic direction; if both ACLs and NAT are running on the router, the order in which they are applied to a traffic flow is important:
 Inbound traffic is processed by the inbound ACL before being processed by outside-to-inside NAT
 Outbound traffic is processed by the outbound ACL after being processed by inside-to-outside NAT
 Order of access control elements: ACL entries should be ordered from specific to general
 Implicit deny all: Every ACL ends with an implicit deny all, so traffic that is not explicitly permitted is dropped; forgetting this can be the cause of an ACL misconfiguration
 Addresses and wildcard masks: Complex wildcard masks provide significant improvements in efficiency but are more prone to configuration errors
 Selection of Transport layer protocol: It is important to specify the correct Transport layer protocol; do not open both the TCP and the UDP port when unsure which one a flow uses, because this opens a hole in the firewall and adds an extra element to the ACL, so the ACL takes longer to process and introduces more latency
 Source and destination ports: Make sure the source and destination ports and addresses are specified correctly
 Use of the established keyword: The established keyword increases the security provided by an ACL; however, if the keyword is applied to an outbound ACL, unexpected results may occur
 Uncommon protocols: Misconfigured ACLs often cause problems for less common protocols such as VPN and encryption protocols
 Troubleshooting ACL: Use the log keyword on ACL entries; it instructs the router to place an entry in the system log whenever that entry condition is matched. The log keyword is useful for troubleshooting and also provides information on intrusion attempts being blocked by the ACL
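Three of the points above (entry order, the implicit deny all, and wildcard masks) can be seen in a small model of a standard ACL. This is a simplified sketch that matches only on source address, not a reproduction of how IOS processes ACLs internally; the function names and addresses are illustrative.

```python
import ipaddress


def wildcard_match(addr: str, base: str, wildcard: str) -> bool:
    """True if addr equals base on every bit where the wildcard mask is 0."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    w = int(ipaddress.IPv4Address(wildcard))
    care = ~w & 0xFFFFFFFF  # bits that must match
    return (a & care) == (b & care)


def evaluate(acl, src: str) -> str:
    """Return the action of the first matching entry (first match wins)."""
    for action, base, wildcard in acl:
        if wildcard_match(src, base, wildcard):
            return action
    return "deny"  # implicit deny all at the end of every ACL


# Specific entry first, general entry second.
acl_good = [
    ("deny", "192.168.1.10", "0.0.0.0"),
    ("permit", "192.168.1.0", "0.0.0.255"),
]
# Same entries in the wrong order: the host deny is never reached.
acl_bad = list(reversed(acl_good))

print(evaluate(acl_good, "192.168.1.10"))  # deny
print(evaluate(acl_bad, "192.168.1.10"))   # permit
print(evaluate(acl_good, "10.0.0.1"))      # deny (implicit)
```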
Common NAT Issues
• NAT configuration affects both inbound and outbound traffic
• The biggest problem with all NAT technologies is interoperability with other network
technologies
 Some of these technologies include:
 BOOTP and DHCP - A DHCP-Request packet has a source IP address of 0.0.0.0. Because NAT requires both a valid destination and a valid source IP address, BOOTP and DHCP can have difficulty operating across a router running static or dynamic NAT; configuring the IP helper feature can help solve this problem
 DNS and WINS - Because a router running dynamic NAT regularly changes the relationship between inside and outside addresses, a DNS or WINS server outside the NAT router does not have an accurate representation of the network inside the router; configuring the IP helper feature can help solve this problem
 SNMP - As with DNS packets, NAT is not able to alter addressing information stored in the data payload of a packet. Because of this, an SNMP management station on one side of a NAT router may not be able to contact SNMP agents on the other side; configuring the IP helper feature can help solve this problem
 Tunneling and encryption protocols - Tunneling and encryption protocols used by VPNs, such as IPsec and GRE, often cannot be processed by NAT; if such protocols must run through a NAT router, create a static NAT entry for the required port for a single IP address on the inside of the NAT router
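The payload problem described for SNMP can be shown with a toy model. This sketch assumes a packet is just a dictionary with a header source address and a text payload, and that NAT rewrites only the header; all names and addresses are made up for illustration.

```python
def nat_outbound(packet: dict, inside_local: str, inside_global: str) -> dict:
    """Translate the header source address only; the payload is left untouched."""
    translated = dict(packet)
    if translated["src"] == inside_local:
        translated["src"] = inside_global
    return translated


packet = {
    "src": "10.0.0.5",
    "dst": "203.0.113.9",
    # The application embeds its own address in the data it sends.
    "payload": "agent-address=10.0.0.5",
}

after = nat_outbound(packet, inside_local="10.0.0.5", inside_global="198.51.100.5")
print(after["src"])      # 198.51.100.5
print(after["payload"])  # agent-address=10.0.0.5
```

The receiver now sees one address in the header and a different, unreachable address in the payload, which is why such protocols can fail across NAT.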
Application Layer Overview
• Most of the Application layer protocols provide User Services
 The most widely known and implemented TCP/IP Application layer protocols include:
 Telnet - Enables users to establish terminal session connections with remote hosts
 HTTP - Supports the exchange of text, graphics, sound, video, and other multimedia files on the web
 FTP - Performs interactive file transfers between hosts
 TFTP - Performs basic interactive file transfers typically between hosts & networking devices
 SMTP - Supports basic message delivery services
 POP - Connects to mail servers and downloads e-mail
 SNMP - (UDP port 161) Collects management information from network devices
 DNS - Maps IP addresses to the names assigned to network devices
 NFS - (RPC, UDP port 111) Enables computers to mount drives on remote hosts and operate them as if they were local drives; developed by Sun Microsystems, it works together with two other Application layer protocols, External Data Representation (XDR) and Remote Procedure Call (RPC)
Symptoms of Application Layer Problems
• It is possible to have full network connectivity while the application still cannot provide data
 Possible symptoms of Application layer problems include:
 User complaints about slow application performance
 Application error messages
 Console error messages
 System log file messages
 Network Management System alarms

Troubleshooting Application Layer Problems


• The troubleshooting concepts at this layer are the same as at the other layers, but the focus is on refused or timed-out connections, access lists, and DNS issues
 The steps for troubleshooting are as follows:
 Step 1. Ping the default gateway; if successful, Layer 1 and Layer 2 services are functioning properly
 Step 2. Verify end-to-end connectivity by pinging a remote network; if successful, Layer 3 is operating correctly
 Step 3. Verify ACL and NAT operation: clear the ACL counters with the clear access-list counters command, try to establish a connection again, and check which counters have incremented; then use debug ip nat and examine the NAT output. If ACLs and NAT are functioning as expected, the problem must lie in a higher layer
 Step 4. Troubleshoot upper layer protocol connectivity
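The four steps form a simple decision sequence. A minimal sketch, assuming the result of each check is already known as a boolean; the function name and the returned labels are illustrative only.

```python
def locate_fault(gateway_ping_ok: bool, remote_ping_ok: bool, acl_nat_ok: bool) -> str:
    """Walk the steps in order and report where to look next."""
    if not gateway_ping_ok:
        # Step 1 failed: the problem is on the local link or local IP settings.
        return "Layers 1-2 or local IP settings"
    if not remote_ping_ok:
        # Step 2 failed: no end-to-end connectivity.
        return "Layer 3"
    if not acl_nat_ok:
        # Step 3 failed: traffic is being filtered or translated incorrectly.
        return "ACL or NAT configuration"
    # Step 4: everything below is fine, so look at the upper layer protocol.
    return "upper layer protocol"


print(locate_fault(True, True, True))
```

Each check is only meaningful once the ones before it pass, which is why the order of the tests matters.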

Correcting Application Layer Problems


The steps for correcting Application layer problems are as follows:
 Step 1: Make a backup. Before proceeding, save a backup of the current configuration so that you can recover to a known initial state
 Step 2: Make an initial hardware or software configuration change. If the correction requires more than one change, make only one change at a time
 Step 3: Evaluate and document each change and its results. If the problem is intermittent, wait to see if the problem occurs again before evaluating the effect of any change
 Step 4: Determine if the change solves the problem.
 Step 5: Stop when the problem is solved.
 Step 6: If necessary, get assistance from outside resources. This may be a co-worker, a consultant, or the Cisco Technical Assistance Center (TAC); on rare occasions a core dump may be necessary, which creates output that a specialist at Cisco Systems can analyze
 Step 7: Document. Once the problem is resolved, document the solution
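Steps 1 to 5 can be sketched as a loop that applies one change at a time and stops as soon as the problem is solved. The configuration is modeled as a plain dictionary and the changes as functions; these names and the sample settings are hypothetical and only illustrate the procedure.

```python
import copy


def correct_one_change_at_a_time(config, changes, is_solved):
    """Apply changes one by one; return (backup, log of (description, solved))."""
    backup = copy.deepcopy(config)          # Step 1: make a backup
    log = []
    for description, apply_change in changes:
        apply_change(config)                # Step 2: make a single change
        solved = is_solved(config)          # Step 4: did it solve the problem?
        log.append((description, solved))   # Step 3: document the change and its result
        if solved:
            break                           # Step 5: stop when the problem is solved
    return backup, log


config = {"dns": "10.0.0.9", "gateway": "10.0.0.254"}
changes = [
    ("set gateway", lambda c: c.update(gateway="10.0.0.1")),
    ("set dns", lambda c: c.update(dns="10.0.0.53")),
    ("never reached", lambda c: c.update(extra="unused")),
]
backup, log = correct_one_change_at_a_time(
    config, changes, is_solved=lambda c: c["dns"] == "10.0.0.53"
)
print(log)
```

Because only one change is made per iteration, the log shows exactly which change resolved the problem, and the backup allows a return to the initial state.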
