Cluster Switch
Troubleshooting

Randall Koonce and Sanjeev Shukla


Agenda

Introduction
Cluster switch cases supported by L1.
Cluster switch types
Useful Data to collect
Cluster Switch alert types
FRU Failures and Replacements
Additional Resources
Q&A

2009 NetApp. All rights reserved. NetApp Confidential - Internal Use only 2
Introduction
L1 is now responsible for supporting the following cluster switch
AutoSupport alert types:

SwitchPsuFanNotOperational_Alert

SwitchFanNotOperational_Alert

SwitchPowerNotOperational_Alert

SwitchPowerNotPresent_Alert

CN1610 Overview

The CN1610 switch is used as a Clustering switch.


The switch has 24 ports
The switch has two redundant power supplies and two FRU fan
modules

CN1601 Overview

The CN1601 is used as a management switch in clustered Data ONTAP.

The switch is also a 24-port switch.

It does not have any FRUs, so from a hardware perspective there
is very little we can do other than replace the switch.

Cisco NX5010, NX5020, NX5596 Overview
The Cisco Nexus 5500 switch is a top-of-rack, 10-Gigabit Ethernet and Fibre Channel over
Ethernet (FCoE) switch.

The Cisco Nexus 5500 switch has the following features:


48 fixed 1- and 10-Gigabit Ethernet server connection ports on the back of the switch
Three slots on the back of the switch for optional expansion modules, which can be
either a 16-port
Two slots on the front of the switch for hot swap-capable power supplies, which provide
front-to-back airflow for cooling (the Cisco Nexus 5596T and 5596UP switches alternatively
support back-to-front [port-side intake] airflow)
Four slots on the front of the switch for hot swap-capable fan modules

Data to collect

show environment: The most useful command for switch hardware issues.
Displays fan, LED, temperature, and power supply information.

show tech-support: Produces a very large amount of output. Collect it
only if you need to escalate the case.

Data to collect
switch# show environment
Fan:
------------------------------------------------------
Fan             Model                Hw         Status
------------------------------------------------------
Chassis-1       N5K-C5020-FAN        --         ok
Chassis-2       --                   --         absent
Chassis-3       N5K-C5020-FAN        --         ok
Chassis-4       N5K-C5020-FAN        --         ok
Chassis-5       N5K-C5020-FAN        --         ok
PS-1            N5K-PAC-1200W        --         failure
PS-2            N5K-PAC-1200W        --         ok

Temperature
-----------------------------------------------------------------
Module   Sensor     MajorThresh   MinorThres   CurTemp     Status
                    (Celsius)     (Celsius)    (Celsius)
-----------------------------------------------------------------
1        Outlet-1   60            50           41          ok
1        Outlet-2   60            50           44          ok
1        Outlet-3   60            50           36          ok
1        Outlet-4   60            50           39          ok
1        Intake-1   50            40           26          ok
1        Intake-2   50            40           25          ok
1        Intake-3   50            40           25          ok
1        Intake-4   50            40           25          ok
1        PS-1       60            50           20          ok
1        PS-2       60            50           27          ok
3        Outlet-1   60            50           30          ok
2        Outlet-1   60            50           32          ok
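
As a rough illustration of how the fan section of this output can be screened, the following Python sketch flags every fan or power-supply fan row whose status is not "ok". The function name, the regular expression, and the embedded sample text are assumptions made for this example; they are not part of any NetApp or Cisco tool, and real output may use different spacing or row names.

```python
import re

# Sample fan section, copied from the slide output above.
SAMPLE = """\
Fan:
------------------------------------------------------
Fan             Model                Hw         Status
------------------------------------------------------
Chassis-1       N5K-C5020-FAN        --         ok
Chassis-2       --                   --         absent
Chassis-3       N5K-C5020-FAN        --         ok
Chassis-4       N5K-C5020-FAN        --         ok
Chassis-5       N5K-C5020-FAN        --         ok
PS-1            N5K-PAC-1200W        --         failure
PS-2            N5K-PAC-1200W        --         ok
"""

# Assumed row shape: name, model, hardware revision, status.
# Only rows whose first column is Chassis-N or PS-N are considered,
# so temperature and power-supply table rows are ignored.
ROW = re.compile(r"^(Chassis-\d+|PS-\d+)\s+(\S+)\s+(\S+)\s+(\S+)$")


def unhealthy_fans(output):
    """Return (name, status) for every fan row whose status is not 'ok'."""
    problems = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(4) != "ok":
            problems.append((match.group(1), match.group(4)))
    return problems


if __name__ == "__main__":
    for name, status in unhealthy_fans(SAMPLE):
        print(f"{name}: {status}")
```

In the sample above, Chassis-2 (absent) and PS-1 (failure) are the rows to investigate.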

Data to collect
Power Supply:
Voltage: 12 Volts
-----------------------------------------------------
PS  Model                Power       Power     Status
                         (Watts)     (Amp)
-----------------------------------------------------
1   --                   --          --        fail/shutdown
2   N5K-PAC-1200W        1200.00     100.00    ok

Mod Model                Power      Power      Power      Power      Status
                         Requested  Requested  Allocated  Allocated
                         (Watts)    (Amp)      (Watts)    (Amp)
--- -------------------- ---------- ---------- ---------- ---------- ----------
1   N5K-C5020P-BF-SUP    625.20     52.10      625.20     52.10      powered-up
2   N5K-M1600            54.00      4.50       54.00      4.50       powered-up
3   N5K-M1008            9.96       0.83       9.96       0.83       powered-up

Power Usage Summary:
--------------------
Power Supply redundancy mode:                 Redundant
Power Supply redundancy operational mode:     Non-redundant

Total Power Capacity                          1200.00 W
Power reserved for Supervisor(s)               625.20 W
Power currently used by Modules                 63.96 W
                                             -------------
Total Power Available                          510.84 W
                                             -------------
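
The figures in the power usage summary can be cross-checked with simple arithmetic. The Python sketch below recomputes "Total Power Available" from the values shown in the sample output. The variable names are illustrative only. The 1200.00 W capacity equals the rating of the one supply reported as ok, which fits the "Non-redundant" operational mode, but that reading is an inference from the sample, not a documented rule.

```python
from decimal import Decimal

# Values copied from the sample output above.
total_capacity = Decimal("1200.00")       # Total Power Capacity
supervisor_reserved = Decimal("625.20")   # Module 1 (supervisor)
module_draw = [Decimal("54.00"), Decimal("9.96")]  # Modules 2 and 3

# Power currently used by Modules is the sum of the non-supervisor modules.
modules_used = sum(module_draw, Decimal("0"))

# Available power is what remains after the supervisor and modules.
available = total_capacity - supervisor_reserved - modules_used

print(f"Power currently used by Modules: {modules_used} W")
print(f"Total Power Available:           {available} W")
```

Both results match the summary lines in the sample (63.96 W and 510.84 W).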
AutoSupport Types and Procedures

SwitchPsuFanNotOperational_Alert

Probable Cause:
The chassis fan on the cluster switch is not functioning properly, is not
installed correctly, or has been removed.

Corrective Actions:

Verify that the fan module is inserted properly.

Replace the fan if it is determined to be defective.

AutoSupport Types and Procedures

SwitchFanNotOperational_Alert

Probable Cause:
The chassis fan on the cluster switch is not functioning properly, is not
installed correctly, or has been removed.

Corrective Actions:

Verify that the fan module is inserted properly.

Replace the fan if it is determined to be defective.

AutoSupport Types and Procedures

SwitchPowerNotOperational_Alert

Probable Cause:
The power supply has failed, is not receiving power, or is not installed.

Corrective Action:

Verify the power supply is installed.


Verify the power supply is receiving power.
Replace the power supply if it is determined to be defective.

AutoSupport Types and Procedures

SwitchPowerNotPresent_Alert

Probable Cause:
The power supply has failed, is not receiving power, or is not installed.

Corrective Action:

Verify the power supply is installed.


Verify the power supply is receiving power.
Replace the power supply if it is determined to be defective.
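
The alert-to-action mapping on these slides can be summarized as a small lookup table. The Python sketch below is one hypothetical way to encode it. The dictionary and function names are invented for illustration, and assigning the fan steps to SwitchFanNotOperational_Alert is an assumption based on the alert name and the list in the Introduction.

```python
# Corrective actions as worded on the slides.
FAN_STEPS = [
    "Verify that the fan module is inserted properly.",
    "Replace the fan if it is determined to be defective.",
]
POWER_STEPS = [
    "Verify the power supply is installed.",
    "Verify the power supply is receiving power.",
    "Replace the power supply if it is determined to be defective.",
]

# Hypothetical lookup table for the four alert types supported by L1.
ALERT_STEPS = {
    "SwitchPsuFanNotOperational_Alert": FAN_STEPS,
    # Assumption: fan steps also apply to this alert.
    "SwitchFanNotOperational_Alert": FAN_STEPS,
    "SwitchPowerNotOperational_Alert": POWER_STEPS,
    "SwitchPowerNotPresent_Alert": POWER_STEPS,
}


def corrective_actions(alert_type):
    """Return the corrective steps for an alert, or an empty list if unknown."""
    return list(ALERT_STEPS.get(alert_type, []))


if __name__ == "__main__":
    for step in corrective_actions("SwitchPowerNotPresent_Alert"):
        print("-", step)
```

An empty result means the alert is not one of the four types listed in the Introduction.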

FRU Failures and Replacements

If one PSU fails (with both PSUs present), the system issues a
warning and sets the system fans to maximum speed. A single PSU
failure alone does not shut the system down.

If one of two PSUs is removed, the system issues a warning and
sets the system fans to maximum speed. A single PSU removal alone
does not shut the system down.

If a single system fan fails, the system issues a warning and sets
the remaining system fans to maximum speed. A single system fan
failure alone does not shut the system down.

If two system fans fail, the system issues a warning and sets the
remaining system fans to maximum speed. The system does not
distinguish between failures in the same fan FRU and failures in
different fan FRUs. The system shuts down after 2 minutes if the
two-fan-failure condition persists.

If more than two fans fail, the system issues a warning and shuts
down immediately.

With only two power supplies installed, the system can sustain the
failure of only one of them.
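
The fan-failure rules above can be read as a small decision table. The Python sketch below encodes them for illustration only. The function name and return shape are assumptions for this example, not switch firmware behavior beyond what the slide states.

```python
def fan_failure_response(failed_fans):
    """Map a count of failed system fans to (warning, shutdown_delay_seconds).

    shutdown_delay_seconds is None when the failure alone does not cause a
    shutdown, 120 for the two-fan case (if the condition persists), and 0
    for an immediate shutdown.
    """
    if failed_fans < 0:
        raise ValueError("failed_fans must be non-negative")
    if failed_fans == 0:
        return (False, None)   # healthy: no warning, no shutdown
    if failed_fans == 1:
        return (True, None)    # warning, remaining fans to maximum speed
    if failed_fans == 2:
        return (True, 120)     # warning, shutdown after 2 minutes if it persists
    return (True, 0)           # more than two fans: immediate shutdown


if __name__ == "__main__":
    for count in range(4):
        print(count, fan_failure_response(count))
```

The count covers individual fans, whether they sit in the same fan FRU or in different ones, matching the note that the system does not differentiate between the two cases.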
Additional resources

AutoSupport alert types and corrective actions:

https://kb.netapp.com/support/s/article/autosupport-message-health-monitor-process-cshm?t=1475601845266

Cluster switch template:

https://kb.netapp.com/support/s/article/triage-template-how-to-diagnose-cisco-cluster-switch-issues-in-clustered-data-ontap-configurations

When to escalate

Escalate the case to L2 if any of the following conditions occurs:

1. A power supply replacement does not resolve the issue.

2. A fan replacement does not resolve the issue.

3. We suspect that a switch replacement will be needed.

4. Your SSE2 feels it is best to escalate.

Questions
