Administration Guide

Hardware Platform Monitoring Guide

NetApp, Inc.
495 East Java Drive
Sunnyvale, CA 94089
U.S.

Telephone: +1 (408) 822-6000


Fax: +1 (408) 822-4501
Support telephone: +1 (888) 463-8277
Web: www.netapp.com
Feedback: doccomments@netapp.com

Part number: 215-06774_A0_ur003


July 2014

Table of Contents | 3

Contents
Sources of troubleshooting information ................................................... 26
Where LEDs appear .................................................................................................. 26
Where messages are displayed .................................................................................. 26
AutoSupport email messages help with troubleshooting .......................................... 27
Forms and use of diagnostic tools ............................................................................. 28
Where to find documentation .................................................................................... 28

Storage system LEDs .................................................................................. 30


20xx and SA200 system LEDs .................................................................................. 30
Location and meaning of LEDs on the front of 20xx and SA200 chassis .... 30
Location and meaning of LEDs on the back of 20xx and SA200
controller modules ................................................................................... 32
Location and meaning of 20xx and SA200 PSU LEDs ................................ 34
FAS22xx system LEDs ............................................................................................. 35
Location and meaning of LEDs on the front of 22xx chassis ....................... 35
Location and meaning of LEDs on the back of 22xx controllers .................. 36
Location and meaning of 22xx internal drive LEDs ..................................... 39
Location and meaning of 22xx PSU LEDs ................................................... 41
Location and meaning of 22xx internal FRU LEDs ..................................... 43
FAS25xx system LEDs ............................................................................................. 43
Location and meaning of LEDs on the front of FAS2520, FAS2552, and
FAS2554 chassis ..................................................................................... 43
Location and meaning of FAS25xx internal drive LEDs .............................. 45
Location and meaning of LEDs on the back of FAS2520 controllers .......... 47
Location and meaning of LEDs on the back of FAS255x controllers .......... 50
Location and meaning of FAS25xx PSU LEDs ............................................ 53
Location and meaning of FAS25xx internal FRU LEDs .............................. 55
SA300 system LEDs ................................................................................................. 56
Location and meaning of LEDs on the front of SA300 controllers .............. 56
Location and meaning of LEDs on the back of SA300 controllers .............. 57
Location and meaning of SA300 fan LEDs .................................................. 58
Location and meaning of SA300 PSU LEDs ................................................ 59
31xx system LEDs .................................................................................................... 60

Location and meaning of LEDs on the front of 31xx chassis ....................... 60
Location and meaning of LEDs on the back of 31xx controllers .................. 62
Location and meaning of 31xx fan LEDs ..................................................... 63
Location and meaning of 31xx PSU LEDs ................................................... 63
Location and meaning of 31xx FRU LEDs ................................................... 65
32xx and SA320 system LEDs .................................................................................. 65
Location and meaning of LEDs on the front of 32xx and SA320 chassis .... 65
Location and meaning of LEDs on the back of 32xx and SA320
controllers ................................................................................................ 66
Location and meaning of LED on the back of 32xx and SA320 I/O
expansion modules .................................................................................. 69
Location and meaning of 32xx and SA320 fan LEDs .................................. 70
Location and meaning of 32xx and SA320 PSU LEDs ................................ 71
Location and meaning of 32xx and SA320 internal FRU LEDs ................... 72
60xx and SA600 system LEDs .................................................................................. 73
Location and meaning of LEDs on the front of 60xx and SA600
controllers ................................................................................................ 73
Location and meaning of LEDs on the back of 60xx and SA600
controllers ................................................................................................ 74
Location and meaning of 60xx and SA600 fan LEDs .................................. 75
Location and meaning of 60xx and SA600 PSU LEDs ................................ 76
62xx and SA620 system LEDs .................................................................................. 77
Location and meaning of LEDs on the front of 62xx and SA620 chassis .... 77
Location and meaning of LEDs on the back of 62xx and SA620
controllers ................................................................................................ 79
Location and meaning of the 62xx and SA620 I/O expansion module
LED ......................................................................................................... 83
Location and meaning of 62xx and SA620 fan LEDs .................................. 84
Location and meaning of 62xx and SA620 PSU LEDs ................................ 84
Location and meaning of 62xx and SA620 internal FRU LEDs ................... 85
FAS80xx system LEDs ............................................................................................. 86
Location and meaning of LEDs on the front of the FAS8020 chassis .......... 86
Location and meaning of LEDs on the front of FAS8040, FAS8060, and
FAS8080 chassis ..................................................................................... 88
Location and meaning of LEDs on the back of FAS8020 controllers .......... 89

Location and meaning of LEDs on the back of FAS8040, FAS8060, and
FAS8080 controllers ................................................................................ 92
Location and meaning of LEDs on the back of FAS80xx I/O expansion
modules .................................................................................................... 96
Location and meaning of FAS8020 fan LEDs .............................................. 98
Location and meaning of FAS8040, FAS8060, and FAS8080 fan LEDs ..... 98
Location and meaning of FAS80xx power supply LEDs ............................. 99
Location and meaning of FAS8020 internal FRU LEDs ............................ 100
Location and meaning of FAS8040, FAS8060, and FAS8080 internal
FRU LEDs ............................................................................................. 101
NVRAM adapter LEDs ........................................................................................... 101
Location and meaning of NVRAM5 and NVRAM6 LEDs ........................ 102
Location and meaning of NVRAM5 and NVRAM6 media converter
LEDs ...................................................................................................... 103
Location and meaning of NVRAM7 LEDs ................................................. 103
Location and meaning of NVRAM8 LEDs ................................................. 104
Location and meaning of NVRAM9 LEDs ................................................. 109

Adapter card LEDs .................................................................................. 112


Converged network adapter (CNA) and unified target adapter (UTA/UTA2)
LEDs .................................................................................................................. 112
Location and meaning of dual-port, 10-Gb, FCoE CNA HBA LEDs ........ 112
Location and meaning of dual-port, 16-Gb FC, 10-GbE/FCoE UTA2
LEDs ...................................................................................................... 115
Ethernet NIC LEDs ................................................................................................. 116
Location and meaning of single-port GbE NIC LEDs ................................ 116
Location and meaning of multiport GbE NIC LEDs .................................. 118
Location and meaning of LEDs on the dual-port 10-GbE NIC that
supports fiber optic cables with SFP+ modules or copper SFP+
cables ..................................................................................................... 122
Location and meaning of LEDs on the dual-port 10-GbE NIC that
supports fiber optic cables with X6569 SFP+ modules or copper SFP+
cables ..................................................................................................... 123
Location and meaning of single-port, 10-GbE NIC LEDs (2050 systems
only) ....................................................................................................... 124
Location and meaning of dual-port 10-GbE RJ45 NIC LEDs .................... 125
Flash Cache module and PAM LEDs ..................................................................... 127

Location and meaning of PAM LEDs ......................................................... 127
Location and meaning of Flash Cache module LEDs ................................. 128
HBA LEDs .............................................................................................................. 129
Location and meaning of dual-port Fibre Channel HBA LEDs ................... 129
Location and meaning of dual-port, 4-Gb or 8-Gb, target-mode Fibre
Channel HBA LEDs .............................................................................. 130
Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs:
four-LED version ................................................................................... 132
Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs:
12-LED version ..................................................................................... 133
Location and meaning of quad-port, 8-Gb, Fibre Channel HBA LEDs:
12-LED version ..................................................................................... 135
Location and meaning of fiber-optic iSCSI target HBA LEDs .................. 137
Location and meaning of copper iSCSI target HBA LEDs ........................ 138
Location of dual-port, 3-Gb SAS HBA ports .............................................. 139
Location of quad-port, 3-Gb SAS HBA ports ............................................. 140
MetroCluster (FCVI) adapter LEDs ........................................................................ 142
Location and meaning of dual-port, 2-Gb MetroCluster adapter LEDs ..... 142
Location and meaning of dual-port, 4-Gb MetroCluster adapter LEDs ..... 143
Location and meaning of dual-port, 8-Gb MetroCluster adapter LEDs ..... 145
Location and meaning of dual-port, 16-Gb MetroCluster adapter LEDs ... 147
TCP offload engine (TOE) NIC LEDs .................................................................... 148
Location and meaning of single-port TOE NIC LEDs ............................... 148
Location and meaning of quad-port TOE NIC LEDs ................................. 149
Location and meaning of dual-port, 10GBase-SR TOE NIC LEDs ........... 151
Location and meaning of dual-port, 10GBase-CX4 TOE NIC LEDs ......... 152

Startup messages ...................................................................................... 154


POST messages ....................................................................................................... 154
Boot messages ......................................................................................................... 155
FAS20xx and SA200 startup progress .................................................................... 155
Method of viewing progress on the console ................................................ 155
Method of viewing progress through the BIOS Status sensor .................... 156
31xx, 60xx, SA300, and SA600 system POST error messages .............................. 157
0200: Failure Fixed Disk ............................................................................. 157
0230: System RAM Failed at offset: ........................................................... 158
0231: Shadow RAM failed at offset ............................................................ 158

0232: Extended RAM failed at address line ................................................ 159
0235: Multiple-bit ECC error occurred ....................................................... 159
023C: Bad DIMM found in slot # ............................................................... 159
023E: Node Memory Interleaving disabled ................................................ 160
0241: Agent Read Timeout ......................................................................... 160
0242: Invalid FRU information ................................................................... 161
0250: System battery is dead ....................................................................... 161
0251: System CMOS checksum bad ........................................................... 162
0253: Clear CMOS jumper detected ........................................................... 162
0260: System timer error ............................................................................. 162
0280: Previous boot incomplete .................................................................. 162
02C2: No valid Boot Loader in System Flash - Non Fatal ........................ 163
02C3: No valid Boot Loader in System Flash - Fatal ................................ 163
02F9: FPGA jumper detected ...................................................................... 163
02FA: Watchdog Timer Reboot (PciInit) .................................................... 164
02FB: Watchdog Timer Reboot (MemTest) ............................................... 164
02FC: LDTStop Reboot (HTLinkInit) ........................................................ 165
No message on console ............................................................................... 165
FAS22xx, FAS25xx, 32xx, 62xx, FAS80xx, SA320, and SA620 system POST
error messages ................................................................................................... 166
0200: Failure Fixed Disk ............................................................................. 166
0230: System RAM Failed at offset ............................................................ 166
0231: Shadow RAM Failed at offset ........................................................... 166
0232: Extended RAM Failed at address line ............................................... 166
023A: ONTAP Detected Bad DIMM in slot ............................................... 167
023B: BIOS detected SPD checksum error in DIMM slot: ........................ 167
023E: Node Memory Interleaving disabled ................................................ 167
0241: SMBus Read Timeout ....................................................................... 167
0242: Invalid FRU information ................................................................... 167
0250: System battery is dead - Replace and run SETUP ............................ 168
0251: System CMOS checksum bad ........................................................... 168
0260: System timer error ............................................................................. 168
0271: Check date and time settings ............................................................. 168
0280: Previous boot incomplete - Default configuration used .................... 169
02A1: SP Not Found ................................................................................... 169
02A2: System Error Log (SEL) Full ........................................................... 169

02A3: No Response From SP To FRU ID Read Request ........................... 169
02C2: No valid Boot Loader in System Flash - Non Fatal ......................... 169
02C3: No valid Boot Loader in System Flash - Fatal ................................. 170
BIOS detected errors or invalid configuration in DIMM slot: .................... 170
BIOS detected pattern write/read mismatch in DIMM slot: ....................... 170
BIOS detected uncorrectable ECC error in DIMM slot: ............................. 171
BIOS detected unknown errors in DIMM slot: ........................................... 171
Fatal Error! All DIMM failed and system can not continue boot! .............. 171
Fatal Error! All channels are disabled! ....................................................... 171
Fatal Error! RDIMMs and UDIMMs are mixed! ........................................ 172
Fatal Error! UDIMM in 3rd slot is not supported! ...................................... 172
Fatal Error: No DIMM detected and system can not continue boot! .......... 172
No Response to Controller FRU ID Read Request via IPMI ...................... 173
No Response to Midplane FRU ID Read Request via IPMI ....................... 173
No message on the console ......................................................................... 173
SP FRU Entry is Blank or Checksum Error ................................................ 173
Software memory test failed! ...................................................................... 173
Boot error messages ................................................................................................ 174
Boot device err ............................................................................................ 174
Cannot initialize labels ................................................................................ 174
Cannot read labels ....................................................................................... 174
Configuration exceeds max PCI space ........................................................ 174
DIMM slot # has correctable ECC errors .................................................... 175
Dirty shutdown in degraded mode .............................................................. 175
Disk label processing failed ........................................................................ 175
Drive %s.%d not supported ......................................................................... 175
Error detection detected too many errors to analyze at once ...................... 175
FC-AL loop down, adapter %d ................................................................... 176
File system may be scrambled .................................................................... 176
Halted disk firmware too old ....................................................................... 177
Halted: Illegal configuration ....................................................................... 177
Invalid PCI card slot %d ............................................................................. 177
No /etc/rc ..................................................................................................... 177
No disk controllers ...................................................................................... 178
No disks ....................................................................................................... 178
No /etc/rc, running setup ............................................................................. 178

No network interfaces ................................................................................. 178
No NVRAM present .................................................................................... 178
NVRAM #n downrev .................................................................................. 179
NVRAM: wrong pci slot ............................................................................. 179
Panic: DIMM slot #n has uncorrectable ECC errors ................................... 179
This platform is not supported on this release ............................................. 179
Too many errors in too short time ............................................................... 180
Warning: Motherboard Revision not available ........................................... 180
Warning: Motherboard Serial Number not available .................................. 180
Warning: system serial number is not available .......................................... 180
Watchdog error ............................................................................................ 180
Watchdog failed .......................................................................................... 180

EMS and operational messages ............................................................... 182


Environmental EMS messages ................................................................................ 182
Chassis fan FRU failed ................................................................................ 182
Chassis over temperature on XXXX ........................................................... 183
Chassis over temperature shutdown on XXXX .......................................... 183
Chassis Power Degraded: 3.3V in warn high state ..................................... 183
Chassis power degraded: PS# ..................................................................... 184
Chassis Power Fail: PS# .............................................................................. 184
Chassis Power Shutdown ............................................................................ 184
Chassis power shutdown: 3.3V in warn low state ....................................... 185
Chassis Power Supply: PS# removed .......................................................... 185
Chassis power supply degraded: PS# .......................................................... 186
Chassis power supply fail: PS# ................................................................... 186
Chassis power supply off: PS# .................................................................... 186
Chassis power supply off: PS# .................................................................... 187
Chassis power supply OK: PS# ................................................................... 187
Chassis power supply removed: PS# .......................................................... 187
Chassis under temperature on XXXX ......................................................... 188
Chassis under temperature shutdown on XXXX ........................................ 188
Fan: # is spinning below tolerable speed .................................................... 188
monitor.chassisFan.degraded ...................................................................... 189
monitor.chassisFan.ok ................................................................................. 189
monitor.chassisFan.removed ....................................................................... 189
monitor.chassisFan.slow ............................................................................. 189

monitor.chassisFan.stop .............................................................................. 190
monitor.chassisFan.warning ........................................................................ 190
monitor.chassisFanFail.xMinShutdown ...................................................... 190
monitor.chassisPower.degraded .................................................................. 190
monitor.chassisPower.ok ............................................................................. 191
monitor.chassisPowerSupplies.ok ............................................................... 191
monitor.chassisPowerSupply.degraded ....................................................... 191
monitor.chassisPowerSupply.notPresent .................................................... 191
monitor.chassisPowerSupply.off ................................................................. 192
monitor.chassisPowerSupply.ok ................................................................. 192
monitor.chassisTemperature.cool ................................................................ 192
monitor.chassisTemperature.ok .................................................................. 192
monitor.chassisTemperature.warm ............................................................. 192
monitor.cpuFan.degraded ............................................................................ 193
monitor.cpuFan.failed ................................................................................. 193
monitor.cpuFan.ok ...................................................................................... 193
monitor.ioexpansion.unpresent ................................................................... 194
monitor.ioexpansionPower.degraded .......................................................... 194
monitor.ioexpansionPower.ok ..................................................................... 194
monitor.ioexpansionTemperature.cool ........................................................ 194
monitor.ioexpansionTemperature.ok .......................................................... 195
monitor.ioexpansionTemperature.warm ..................................................... 195
monitor.nvmembattery.warninglow ............................................................ 195
monitor.nvramLowBattery .......................................................................... 195
monitor.power.unreadable ........................................................................... 196
monitor.shutdown.cancel ............................................................................ 196
monitor.shutdown.cancel.nvramLowBattery .............................................. 196
monitor.shutdown.chassisOverTemp .......................................................... 196
monitor.shutdown.chassisUnderTemp ........................................................ 197
monitor.shutdown.emergency ..................................................................... 197
monitor.shutdown.ioexpansionOverTemp .................................................. 197
monitor.shutdown.nvramLowBattery.pending ........................................... 197
monitor.temp.unreadable ............................................................................. 198
Multiple chassis fans have failed ................................................................ 198
Multiple fan failure on XXXX .................................................................... 198
Multiple power supply fans failed ............................................................... 199

nvmem.battery.capacity.low ....................................................................... 199
nvmem.battery.capacity.low.warn .............................................................. 199
nvmem.battery.capacity.normal .................................................................. 200
nvmem.battery.current.high ........................................................................ 200
nvmem.battery.current.high.warn ............................................................... 200
nvmem.battery.sensor.unreadable ............................................................... 200
nvmem.battery.temp.high ............................................................................ 201
nvmem.battery.temp.low ............................................................................. 201
nvmem.battery.temp.normal ....................................................................... 201
nvmem.battery.voltage.high ........................................................................ 202
nvmem.battery.voltage.high.warn ............................................................... 202
nvmem.battery.voltage.normal .................................................................... 202
nvmem.voltage.high .................................................................................... 202
nvmem.voltage.high.warn ........................................................................... 203
nvmem.voltage.normal ................................................................................ 203
nvram.bat.missing.error ............................................................................... 203
nvram.battery.capacity.low ......................................................................... 203
nvram.battery.capacity.low.critical ............................................................. 204
nvram.battery.capacity.low.warn ................................................................ 204
nvram.battery.capacity.normal .................................................................... 204
nvram.battery.charging.nocharge ................................................................ 204
nvram.battery.charging.normal ................................................................... 205
nvram.battery.charging.wrongcharge .......................................................... 205
nvram.battery.current.high .......................................................................... 205
nvram.battery.current.high.warn ................................................................. 206
nvram.battery.current.low ........................................................................... 206
nvram.battery.current.low.warn .................................................................. 206
nvram.battery.current.normal ...................................................................... 206
nvram.battery.end_of_life.high ................................................................... 207
nvram.battery.end_of_life.normal ............................................................... 207
nvram.battery.fault ...................................................................................... 207
nvram.battery.fault.warn ............................................................................. 207
nvram.battery.fcc.low .................................................................................. 208
nvram.battery.fcc.low.critical ...................................................................... 208
nvram.battery.fcc.low.warn ......................................................................... 208
nvram.battery.fcc.normal ............................................................................ 208

nvram.battery.power.fault ........................................................................... 209
nvram.battery.power.normal ....................................................................... 209
nvram.battery.sensor.unreadable ................................................................. 209
nvram.battery.temp.high ............................................................................. 210
nvram.battery.temp.high.warn .................................................................... 210
nvram.battery.temp.low ............................................................................... 210
nvram.battery.temp.low.warn ...................................................................... 210
nvram.battery.temp.normal ......................................................................... 211
nvram.battery.voltage.high .......................................................................... 211
nvram.battery.voltage.high.warn ................................................................. 211
nvram.battery.voltage.low ........................................................................... 211
nvram.battery.voltage.low.warn .................................................................. 212
nvram.battery.voltage.normal ..................................................................... 212
nvram.hw.initFail ........................................................................................ 212
FCoE HBA EMS messages ..................................................................................... 213
ispcna.mpi.dump ......................................................................................... 213
ispcna.mpi.dump.saved ............................................................................... 213
ispcna.mpi.initFailed ................................................................................... 213
Flash Cache module and PAM module EMS messages ......................................... 214
callhome.flash.cache.failed ......................................................................... 214
extCache.io.BlockChecksumError .............................................................. 214
extCache.io.cardError .................................................................................. 214
extCache.io.readError .................................................................................. 215
extCache.io.writeError ................................................................................ 215
extCache.offline .......................................................................................... 215
extCache.ReconfigComplete ....................................................................... 215
extCache.ReconfigFailed ............................................................................ 216
extCache.ReconfigStart ............................................................................... 216
extCache.UECCerror ................................................................................... 216
extCache.UECCmax ................................................................................... 217
fal.chan.offline.comp ................................................................................... 217
fal.chan.online.erase.warn ........................................................................... 217
fal.chan.online.fail ....................................................................................... 217
fal.chan.online.read.warn ............................................................................ 218
fal.chan.online.rep.fail ................................................................................. 218
fal.chan.online.rep.part ................................................................................ 218

fal.chan.online.rep.succ ............................................................................... 219
fal.chan.online.rep.ver.err ........................................................................... 219
fal.chan.online.write.warn ........................................................................... 219
fal.init.failed ................................................................................................ 219
fmm.bad.block.detected .............................................................................. 219
fmm.device.stats.missing ............................................................................ 220
fmm.domain.card.failure ............................................................................. 220
fmm.domain.core.failure ............................................................................. 220
fmm.domain.lun.failure ............................................................................... 220
fmm.hourly.device.report ............................................................................ 221
fmm.log.bb .................................................................................................. 221
fmm.threshold.bank.degraded ..................................................................... 221
fmm.threshold.bank.offline ......................................................................... 221
fmm.threshold.card.degraded ...................................................................... 222
fmm.threshold.card.failure .......................................................................... 222
fmm.threshold.core.offline .......................................................................... 222
fmm.threshold.lun.offline ............................................................................ 222
iomem.bbm.bbtl.overflow ........................................................................... 223
iomem.bbm.init.failed ................................................................................. 223
iomem.bbm.new.flash ................................................................................. 223
iomem.card.disable ...................................................................................... 223
iomem.card.enable ...................................................................................... 224
iomem.card.fail.cecc ................................................................................... 224
iomem.card.fail.data.crc .............................................................................. 224
iomem.card.fail.desc.crc .............................................................................. 224
iomem.card.fail.dimm ................................................................................. 225
iomem.card.fail.firmware.primary .............................................................. 225
iomem.card.fail.fpga ................................................................................... 225
iomem.card.fail.fpga.primary ...................................................................... 226
iomem.card.fail.fpga.rev ............................................................................. 226
iomem.card.fail.internal .............................................................................. 227
iomem.card.fail.pci ...................................................................................... 227
iomem.card.fail.uecc ................................................................................... 227
iomem.dimm.log.checksum ........................................................................ 228
iomem.dimm.log.init ................................................................................... 228
iomem.dimm.log.read ................................................................................. 228

iomem.dimm.log.sync ................................................................................. 228
iomem.dimm.log.write ................................................................................ 228
iomem.dimm.mismatch.banks ..................................................................... 229
iomem.dimm.mismatch.burst ...................................................................... 229
iomem.dimm.mismatch.casLatency ............................................................ 229
iomem.dimm.mismatch.columns ................................................................ 229
iomem.dimm.mismatch.dataWidth ............................................................. 230
iomem.dimm.mismatch.eccWidth ............................................................... 230
iomem.dimm.mismatch.ranks ..................................................................... 230
iomem.dimm.mismatch.rows ...................................................................... 230
iomem.dimm.mismatch.vendor ................................................................... 231
iomem.dimm.spd.banks ............................................................................... 231
iomem.dimm.spd.burst ................................................................................ 231
iomem.dimm.spd.casLatency ...................................................................... 231
iomem.dimm.spd.checksum ........................................................................ 232
iomem.dimm.spd.columns .......................................................................... 232
iomem.dimm.spd.dataWidth ....................................................................... 232
iomem.dimm.spd.detect .............................................................................. 232
iomem.dimm.spd.eccWidth ......................................................................... 233
iomem.dimm.spd.ranks ............................................................................... 233
iomem.dimm.spd.read ................................................................................. 233
iomem.dimm.spd.rows ................................................................................ 233
iomem.dma.crc.data .................................................................................... 234
iomem.dma.crc.desc .................................................................................... 234
iomem.dma.internal ..................................................................................... 234
iomem.dma.stall .......................................................................................... 234
iomem.ecc.cecc ........................................................................................... 235
iomem.ecc.correct.off .................................................................................. 235
iomem.ecc.correct.on .................................................................................. 235
iomem.ecc.detect.off ................................................................................... 235
iomem.ecc.detect.on .................................................................................... 236
iomem.ecc.inject .......................................................................................... 236
iomem.ecc.summary .................................................................................... 236
iomem.ecc.uecc ........................................................................................... 236
iomem.fail.stripe .......................................................................................... 237
iomem.firmware.package.access ................................................................. 237

iomem.firmware.primary ............................................................................ 237
iomem.firmware.program.complete ............................................................ 237
iomem.firmware.program.fail ..................................................................... 238
iomem.firmware.program.reboot ................................................................ 238
iomem.firmware.program.start .................................................................... 238
iomem.firmware.rev .................................................................................... 238
iomem.flash.mismatch.id ............................................................................ 239
iomem.fru.badInfo ....................................................................................... 239
iomem.fru.checksum ................................................................................... 239
iomem.fru.read ............................................................................................ 239
iomem.fru.write ........................................................................................... 240
iomem.i2c.link.down ................................................................................... 240
iomem.i2c.read.addrNACK ......................................................................... 240
iomem.i2c.read.dataNACK ......................................................................... 240
iomem.i2c.read.timeout ............................................................................... 241
iomem.i2c.write.addrNACK ....................................................................... 241
iomem.i2c.write.dataNACK ........................................................................ 241
iomem.i2c.write.timeout ............................................................................. 241
iomem.init.detect.fpga ................................................................................. 241
iomem.init.detect.pci ................................................................................... 242
iomem.init.fail ............................................................................................. 242
iomem.memory.flash.syndrome .................................................................. 242
iomem.memory.none ................................................................................... 242
iomem.memory.power.high ........................................................................ 243
iomem.memory.power.low ......................................................................... 243
iomem.memory.scrub.start .......................................................................... 243
iomem.memory.size .................................................................................... 243
iomem.memory.zero.complete .................................................................... 244
iomem.memory.zero.start ............................................................................ 244
iomem.nor.op.failed .................................................................................... 244
iomem.pci.error.config.bar .......................................................................... 244
iomem.pio.op.failed ..................................................................................... 244
iomem.remap.block ..................................................................................... 245
iomem.remap.target.bad .............................................................................. 245
iomem.temp.report ...................................................................................... 245
iomem.train.complete .................................................................................. 245

iomem.train.fail ........................................................................................... 246
iomem.train.notReady ................................................................................. 246
iomem.train.start .......................................................................................... 246
iomem.vmargin.high ................................................................................... 246
iomem.vmargin.low .................................................................................... 246
iomem.vmargin.nominal ............................................................................. 247
monitor.extCache.failed .............................................................................. 247
monitor.flexscale.noLicense ........................................................................ 247
SAS EMS messages ................................................................................................ 247
ds.sas.config.warning .................................................................................. 247
ds.sas.crc.err ................................................................................................ 248
ds.sas.drivephy.disableErr ........................................................................... 248
ds.sas.element.fault ..................................................................................... 248
ds.sas.element.xport.error ............................................................................ 249
ds.sas.hostphy.disableErr ............................................................................ 249
ds.sas.invalid.word ...................................................................................... 250
ds.sas.loss.dword ......................................................................................... 250
ds.sas.multPhys.disableErr .......................................................................... 250
ds.sas.phyRstProb ........................................................................................ 251
ds.sas.running.disparity ............................................................................... 251
ds.sas.ses.disableErr .................................................................................... 251
ds.sas.xfer.element.fault .............................................................................. 252
ds.sas.xfer.export.error ................................................................................ 252
ds.sas.xfer.not.sent ...................................................................................... 252
ds.sas.xfer.unknown.error ........................................................................... 253
sas.adapter.bad ............................................................................................ 253
sas.adapter.bootarg.option ........................................................................... 253
sas.adapter.debug ........................................................................................ 254
sas.adapter.exception ................................................................................... 254
sas.adapter.failed ......................................................................................... 254
sas.adapter.firmware.download ................................................................... 254
sas.adapter.firmware.fault ........................................................................... 255
sas.adapter.firmware.update.failed .............................................................. 255
sas.adapter.not.ready ................................................................................... 255
sas.adapter.offline ........................................................................................ 256
sas.adapter.offlining .................................................................................... 256

sas.adapter.online ........................................................................................ 256
sas.adapter.online.failed .............................................................................. 256
sas.adapter.onlining ..................................................................................... 257
sas.adapter.reset ........................................................................................... 257
sas.adapter.unexpected.status ...................................................................... 257
sas.cable.error .............................................................................................. 257
sas.cable.pulled ............................................................................................ 258
sas.cable.pushed .......................................................................................... 258
sas.config.mixed.detected ........................................................................... 258
sas.device.invalid.wwn ................................................................................ 258
sas.device.quiesce ........................................................................................ 259
sas.device.resetting ...................................................................................... 259
sas.device.timeout ....................................................................................... 260
sas.initialization.failed ................................................................................. 260
sas.link.error ................................................................................................ 260
sas.port.disabled .......................................................................................... 261
sas.port.down ............................................................................................... 261
sas.shelf.conflict .......................................................................................... 261
sasmon.adapter.phy.disable ......................................................................... 262
sasmon.adapter.phy.event ........................................................................... 262
sasmon.disable.module ................................................................................ 263
shm.threshold.spareBlocksConsumed ......................................................... 263
shm.threshold.spareBlocksConsumedMax ................................................. 263
SES EMS messages ................................................................................................. 263
ses.access.noEnclServ ................................................................................. 263
ses.access.noMoreValidPaths ...................................................................... 264
ses.access.noShelfSES ................................................................................ 265
ses.access.sesUnavailable ............................................................................ 265
ses.badShareStorageConfigErr .................................................................... 266
ses.bridge.fw.getFailWarn ........................................................................... 266
ses.bridge.fw.mmErr ................................................................................... 266
ses.channel.rescanInitiated .......................................................................... 267
ses.config.drivePopError ............................................................................. 267
ses.config.IllegalEsh270 .............................................................................. 267
ses.config.shelfMixError ............................................................................. 268
ses.config.shelfPopError ............................................................................. 268

ses.disk.configOk ........................................................................................ 268
ses.disk.illegalConfigWarn ......................................................................... 268
ses.disk.pctl.timeout .................................................................................... 268
ses.download.powerCyclingChannel .......................................................... 269
ses.download.shelfToReboot ...................................................................... 269
ses.download.suspendIOForPowerCycle .................................................... 269
ses.drive.PossShelfAddr .............................................................................. 270
ses.drive.shelfAddr.mm ............................................................................... 270
ses.exceptionShelfLog ................................................................................. 271
ses.extendedShelfLog .................................................................................. 271
ses.fw.emptyFile .......................................................................................... 272
ses.fw.resourceNotAvailable ....................................................................... 272
ses.giveback.restartAfter ............................................................................. 272
ses.giveback.wait ......................................................................................... 272
ses.psu.coolingReqError .............................................................................. 273
ses.psu.powerReqError ................................................................................ 273
ses.remote.configPageError ........................................................................ 273
ses.remote.elemDescPageError ................................................................... 274
ses.remote.faultLedError ............................................................................. 274
ses.remote.flashLedError ............................................................................ 274
ses.remote.shelfListError ............................................................................ 274
ses.remote.statPageError ............................................................................. 274
ses.shelf.changedID ..................................................................................... 275
ses.shelf.ctrlFailErr ...................................................................................... 275
ses.shelf.em.ctrlFailErr ................................................................................ 276
ses.shelf.IdBasedAddr ................................................................................. 276
ses.shelf.invalNum ...................................................................................... 276
ses.shelf.mmErr ........................................................................................... 277
ses.shelf.OSmmErr ...................................................................................... 277
ses.shelf.powercycle.done ........................................................................... 277
ses.shelf.powercycle.start ............................................................................ 277
ses.shelf.sameNumReassign ........................................................................ 278
ses.shelf.unsupportAllowErr ....................................................................... 278
ses.shelf.unsupportedErr ............................................................................. 278
ses.startTempOwnership ............................................................................. 279
ses.status.ATFCXError ............................................................................... 279

ses.status.ATFCXInfo ................................................................................. 279
ses.status.currentError ................................................................................. 279
ses.status.currentInfo ................................................................................... 280
ses.status.currentWarning ............................................................................ 280
ses.status.displayError ................................................................................. 280
ses.status.displayInfo ................................................................................... 281
ses.status.displayWarning ........................................................................... 281
ses.status.driveError .................................................................................... 281
ses.status.driveOk ........................................................................................ 282
ses.status.driveWarning ............................................................................... 282
ses.status.electronicsError ........................................................................... 282
ses.status.electronicsInfo ............................................................................. 283
ses.status.electronicsWarn ........................................................................... 283
ses.status.ESHPctlStatus ............................................................................. 283
ses.status.fanError ....................................................................................... 283
ses.status.fanInfo ......................................................................................... 284
ses.status.fanWarning .................................................................................. 284
ses.status.ModuleError ................................................................................ 284
ses.status.ModuleInfo .................................................................................. 284
ses.status.ModuleWarn ................................................................................ 285
ses.status.psError ......................................................................................... 285
ses.status.psInfo ........................................................................................... 285
ses.status.psWarning ................................................................................... 286
ses.status.temperatureError ......................................................................... 286
ses.status.temperatureInfo ........................................................................... 287
ses.status.temperatureWarning .................................................................... 287
ses.status.upsError ....................................................................................... 287
ses.status.upsInfo ......................................................................................... 288
ses.status.volError ....................................................................................... 288
ses.status.volWarning .................................................................................. 288
ses.system.em.mmErr .................................................................................. 289
ses.tempOwnershipDone ............................................................................. 289
sfu.adapterSuspendIO ................................................................................. 289
sfu.auto.update.off.impact ........................................................................... 289
sfu.ctrllerElmntsPerShelf ............................................................................ 290
sfu.downloadCtrllerBridge .......................................................................... 290

sfu.downloadError ....................................................................................... 290
sfu.downloadingController .......................................................................... 290
sfu.downloadingCtrllerR1XX ..................................................................... 291
sfu.downloadStarted .................................................................................... 291
sfu.downloadSuccess ................................................................................... 291
sfu.downloadSummary ................................................................................ 291
sfu.downloadSummaryErrors ...................................................................... 291
sfu.FCDownloadFailed ............................................................................... 292
sfu.firmwareDownrev ................................................................................. 292
sfu.firmwareUpToDate ............................................................................... 292
sfu.partnerInaccessible ................................................................................ 292
sfu.partnerNotResponding ........................................................................... 293
sfu.partnerRefusedUpdate ........................................................................... 293
sfu.partnerUpdateComplete ......................................................................... 293
sfu.partnerUpdateTimeout ........................................................................... 294
sfu.rebootRequest ........................................................................................ 294
sfu.rebootRequestFailure ............................................................................. 294
sfu.resumeDiskIO ........................................................................................ 294
sfu.SASDownloadFailed ............................................................................. 295
sfu.statusCheckFailure ................................................................................ 295
sfu.suspendDiskIO ...................................................................................... 295
sfu.suspendSES ........................................................................................... 295
USB boot device EMS messages ............................................................................ 296
usb.adapter.debug ........................................................................................ 296
usb.adapter.exception .................................................................................. 296
usb.adapter.failed ........................................................................................ 296
usb.adapter.reset .......................................................................................... 297
usb.device.failed .......................................................................................... 297
usb.device.initialize.failed ........................................................................... 297
usb.device.maximum.connected ................................................................. 298
usb.device.protocol.mismatch ..................................................................... 298
usb.device.removed ..................................................................................... 299
usb.device.timeout ....................................................................................... 299
usb.device.unsupported ............................................................................... 299
usb.device.unsupported.speed ..................................................................... 300
usb.external.device.not.used ........................................................................ 300

usb.externalHub.notSupported .................................................................... 300
usb.port.error ............................................................................................... 300
usb.port.reset ............................................................................................... 301
usb.port.state.indeterminate ......................................................................... 301
usb.port.status.inconsistent .......................................................................... 301
usbmon.boot.device.failed ........................................................................... 302
usbmon.boot.device.pfa ............................................................................... 302
usbmon.disable.module ............................................................................... 302
usbmon.unable.to.monitor ........................................................................... 303
Operational error messages ..................................................................................... 303
Disk hung during swap ................................................................................ 303
Disk n is broken ........................................................................................... 304
Dumping core .............................................................................................. 304
Error dumping core ..................................................................................... 304
FC-AL LINK_FAILURE ............................................................................ 304
FC-AL RECOVERABLE ERRORS ........................................................... 304
Panicking ..................................................................................................... 305
RMC Alert: Boot Error ............................................................................... 305
RMC Alert: Down Appliance ..................................................................... 305
RMC Alert: OFW POST Error .................................................................... 305
UTA2 (CNA) error messages .................................................................................. 306
UTA2 (CNA) error messages on systems operating in maintenance
mode or Data ONTAP 7-Mode ............................................................. 306
UTA2 (CNA) error messages on systems running clustered Data
ONTAP .................................................................................................. 308

Service Processor messages ..................................................................... 311


When and how SP AutoSupport e-mail messages are sent ..................................... 311
What SP AutoSupport e-mail messages include ..................................................... 312
When and how SP EMS messages are sent ............................................................. 312
SP-generated AutoSupport messages ...................................................................... 312
HEARTBEAT_LOSS ................................................................................. 312
REBOOT (abnormal) .................................................................................. 313
SYSTEM_BOOT_FAILED (POST failed) ................................................ 313
USER_TRIGGERED (sp test) .................................................................... 313
USER_TRIGGERED (system nmi) ............................................................ 313
USER_TRIGGERED (system power cycle) ............................................... 314



USER_TRIGGERED (system power off) ................................................... 314
USER_TRIGGERED (system reset) ........................................................... 314
EMS messages about the SP ................................................................................... 314
sp.firmware.upgrade.reqd ............................................................................ 314
sp.firmware.version.unsupported ................................................................ 315
sp.heartbeat.resumed ................................................................................... 315
sp.heartbeat.stopped .................................................................................... 315
sp.network.link.down .................................................................................. 316
sp.notConfigured ......................................................................................... 316
sp.orftp.failed .............................................................................................. 317
sp.snmp.traps.off ......................................................................................... 317
sp.userlist.update.failed ............................................................................... 317
spmgmt.driver.hourly.stats .......................................................................... 318
spmgmt.driver.mailhost ............................................................................... 319
spmgmt.driver.network.failure .................................................................... 319
spmgmt.driver.timeout ................................................................................ 319

RLM messages .......................................................................................... 321


When and how RLM AutoSupport e-mail messages are sent ................................. 321
What RLM AutoSupport e-mail messages include ................................................. 322
When and how RLM EMS messages are sent ........................................................ 322
RLM-generated AutoSupport messages .................................................................. 322
Heartbeat loss warning ................................................................................ 322
Reboot (power loss) critical ........................................................................ 323
Reboot (watchdog reset) warning ............................................................... 323
Reboot warning ........................................................................................... 323
RLM heartbeat loss ..................................................................................... 323
RLM heartbeat stopped ............................................................................... 324
System boot failed (POST failed) ............................................................... 324
User triggered (RLM test) ........................................................................... 324
User_triggered (system nmi) ....................................................................... 324
User_triggered (system power cycle) .......................................................... 325
User_triggered (system power off) ............................................................. 325
User_triggered (system power on) .............................................................. 325
User_triggered (system reset) ...................................................................... 325
EMS messages about the RLM ............................................................................... 325
rlm.driver.hourly.stats ................................................................................. 325

rlm.driver.mailhost ...................................................................................... 326
rlm.driver.network.failure ........................................................................... 326
rlm.driver.timeout ........................................................................................ 326
rlm.firmware.update.failed .......................................................................... 327
rlm.firmware.upgrade.reqd .......................................................................... 328
rlm.firmware.version.unsupported .............................................................. 328
rlm.heartbeat.bootFromBackup ................................................................... 329
rlm.heartbeat.resumed ................................................................................. 329
rlm.heartbeat.stopped .................................................................................. 329
rlm.network.link.down ................................................................................ 330
rlm.notConfigured ....................................................................................... 330
rlm.orftp.failed ............................................................................................ 331
rlm.snmp.traps.off ....................................................................................... 331
rlm.systemDown.alert ................................................................................. 331
rlm.systemDown.notice ............................................................................... 332
rlm.systemDown.warning ........................................................................... 332
rlm.systemPeriodic.keepAlive .................................................................... 333
rlm.systemTest.notice .................................................................................. 333
rlm.userlist.update.failed ............................................................................. 334

BMC messages .......................................................................................... 335


How and when BMC AutoSupport e-mail notifications are sent ............................ 335
What BMC e-mail notifications include ................................................................. 335
BMC-generated AutoSupport messages ................................................................. 335
BMC_ASUP_UNKNOWN ......................................................................... 336
REBOOT (abnormal) .................................................................................. 336
REBOOT (power loss) ................................................................................ 336
REBOOT (watchdog reset) ......................................................................... 336
SYSTEM_BOOT_FAILED (POST failed) ................................................ 336
SYSTEM_POWER_OFF (environment) .................................................... 337
USER_TRIGGERED (bmc test) ................................................................. 337
USER_TRIGGERED (system nmi) ............................................................ 337
USER_TRIGGERED (system power cycle) ............................................... 337
USER_TRIGGERED (system power off) ................................................... 337
USER_TRIGGERED (system power on) ................................................... 338
USER_TRIGGERED (system power soft-off) ........................................... 338
USER_TRIGGERED (system reset) ........................................................... 338



EMS messages about the BMC ............................................................................... 338
bmc.asup.crit ............................................................................................... 338
bmc.asup.error ............................................................................................. 339
bmc.asup.init ............................................................................................... 339
bmc.asup.queue ........................................................................................... 339
bmc.asup.send ............................................................................................. 339
bmc.asup.smtp ............................................................................................. 340
bmc.batt.id ................................................................................................... 340
bmc.batt.invalid ........................................................................................... 340
bmc.batt.mfg ................................................................................................ 340
bmc.batt.rev ................................................................................................. 341
bmc.batt.seal ................................................................................................ 341
bmc.batt.unknown ....................................................................................... 341
bmc.batt.unseal ............................................................................................ 341
bmc.batt.upgrade ......................................................................................... 341
bmc.batt.upgrade.busy ................................................................................. 342
bmc.batt.upgrade.failed ............................................................................... 342
bmc.batt.upgrade.failure .............................................................................. 342
bmc.batt.upgrade.ok .................................................................................... 343
bmc.batt.upgrade.power-off ........................................................................ 343
bmc.batt.upgrade.voltagelow ...................................................................... 343
bmc.batt.voltage .......................................................................................... 343
bmc.config.asup.off ..................................................................................... 344
bmc.config.corrupted .................................................................................. 344
bmc.config.default ....................................................................................... 344
bmc.config.default.pef.filter ........................................................................ 344
bmc.config.default.pef.policy ...................................................................... 345
bmc.config.fru.systemserial ........................................................................ 345
bmc.config.mac.error .................................................................................. 345
bmc.config.net.error .................................................................................... 345
bmc.config.upgrade ..................................................................................... 346
bmc.power.on.auto ...................................................................................... 346
bmc.reset.ext ................................................................................................ 346
bmc.reset.int ................................................................................................ 346
bmc.reset.power .......................................................................................... 346
bmc.reset.repair ........................................................................................... 347

bmc.reset.unknown ...................................................................................... 347
bmc.sensor.batt.charger.off ......................................................................... 347
bmc.sensor.batt.charger.on .......................................................................... 347
bmc.sensor.batt.time.run.invalid ................................................................. 347
bmc.ssh.key.missing .................................................................................... 348

Additional LED error conditions ............................................................ 349


Clearing the fault LED when software is licensed but not enabled ........................ 349

Copyright information ............................................................................. 350


Trademark information ........................................................................... 351
How to send your comments .................................................................... 352
Index ........................................................................................................... 353


Sources of troubleshooting information


Your storage system alerts you when problems occur and informs you of events that do not pose
problems. It does so with LEDs and messages that appear on your system console.
Monitoring messages and LEDs can help you prevent or correct problems on your system. You can
use this guide to determine the meaning of each message and LED.
The following systems are included in this guide:

FAS20xx and SA200
FAS22xx
FAS25xx
SA300
31xx
32xx and SA320
60xx
62xx and SA620
FAS80xx

Where LEDs appear


LEDs appear on the front of each system chassis, on the backs of controllers, on PSUs, and on fan
FRUs. They also appear on adapters that might be installed on your system.
LEDs for one system family differ from LEDs for another system family. For example, LEDs on
32xx systems differ from those on 80xx systems.

Where messages are displayed


Your system displays messages in different places, depending on the type of message.
The following table lists the types of messages your system might generate, and where you can see
them on your system:
Error message type                                                    Where the type of message is displayed
POST error messages                                                   System console
Boot error messages                                                   System console
EMS environmental messages and other operational messages             System console or LCD display
RLM notifications about the system and EMS messages about the RLM     AutoSupport email messages and the system console
BMC notifications about the system and EMS messages about the BMC     AutoSupport email messages and the system console
SP notifications about the system and EMS messages about the SP       AutoSupport email messages and the system console
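
As an illustration only, the table above can be expressed as a small lookup, for example in a script that tells an operator where to look for a given class of message. The names MESSAGE_LOCATIONS and where_displayed are hypothetical and are not part of Data ONTAP; the keys are shortened forms of the message types in the table, and the values are the locations the table lists.

```python
# Hypothetical lookup built from the table above; not a Data ONTAP API.
MESSAGE_LOCATIONS = {
    "POST error": ("System console",),
    "Boot error": ("System console",),
    "EMS environmental or operational": ("System console", "LCD display"),
    "RLM": ("AutoSupport email messages", "System console"),
    "BMC": ("AutoSupport email messages", "System console"),
    "SP": ("AutoSupport email messages", "System console"),
}


def where_displayed(message_type: str) -> tuple:
    """Return the places where a message type is displayed, per the table."""
    try:
        return MESSAGE_LOCATIONS[message_type]
    except KeyError:
        # Message types that the table does not list are rejected, not guessed.
        raise ValueError(f"unknown message type: {message_type!r}") from None


if __name__ == "__main__":
    for kind in MESSAGE_LOCATIONS:
        print(f"{kind}: {', '.join(where_displayed(kind))}")
```

A lookup that raises an error for unknown types keeps the sketch faithful to the table: it reports only the locations that this guide documents.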

Your system also logs messages. See the System Administration Guide for the version of Data
ONTAP that your system is running for information about message logs.
Additional information about messages that appear on your system console or in logs may be
available through the Syslog Translator on the NetApp Support Site at
support.netapp.com/eservice/ems.

AutoSupport email messages help with troubleshooting


Your system has an AutoSupport feature, which sends email containing information about your
system to technical support. AutoSupport provides customized, real-time support to monitor the
performance of your system.
AutoSupport messages are generated and sent when specific events occur within a system or a
cluster. Messages are also sent weekly to give support personnel information about system
performance. If necessary, technical support contacts you at the email address that you specify to
help resolve any potential system problem.
You also can have AutoSupport messages sent to addresses that you designate, such as those
belonging to your internal support organization.
Descriptions of the AutoSupport messages that you receive are available through the AutoSupport
Message Matrices page on the NetApp Support Site at
support.netapp.com/NOW/knowledge/docs/olio/autosupport/matrices/.
For information about configuring AutoSupport, see the System Administration Guide for the version
of Data ONTAP that your system is running.
Note: AutoSupport is enabled by default. You should keep it enabled because it can significantly
speed up the determination and resolution of problems if they occur on your system.


Forms and use of diagnostic tools


Diagnostic tools enable you to troubleshoot problems with your storage system hardware. The forms
and uses of diagnostics differ, depending on your system model. You must understand how to use the
applicable form of diagnostics for your system.
The following diagnostic tools are available on different systems:
System-level diagnostics
    You can find system-level diagnostics on FAS22xx, FAS25xx, 32xx, 62xx, and
    FAS80xx systems by entering sldiag commands at the Maintenance mode prompt.
    The sldiag commands enable you to specify devices, tests, and options; run
    diagnostics based on the command; and then view the results. These commands are
    documented in the relevant man pages and in the command reference documents on
    the NetApp Support Site at mysupport.netapp.com.
    Additional information about system-level diagnostics is available in the
    System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.
SYSDIAG tool
    The SYSDIAG tool is available on older systems by entering the boot_diags
    command at the boot environment prompt and then navigating menu options.
    The command boots the diagnostic program and then displays the Diagnostic
    Monitor, the interface providing access to diagnostic menus. After you select and run
    a test, the SYSDIAG tool generates a message and displays it on the system console
    if the test finds an error.
    Additional information about the SYSDIAG tool is available in the Diagnostics
    Guide on the NetApp Support Site at mysupport.netapp.com.

Where to find documentation


Documentation is available for specific system families and disk shelves that might be attached to
your storage system. You can find documentation on the NetApp Support Site at
mysupport.netapp.com.
Platform or disk shelf type    System or disk shelf model                Document
FAS systems                    FAS20xx, FAS22xx, FAS25xx, 31xx, 32xx,    Hardware Platform Monitoring Guide
                               60xx, 62xx, and FAS80xx systems           (this guide)
                               FAS250 and FAS270 systems                 FAS250/FAS270 Hardware and Service Guide
V-Series systems               31xx, 32xx, 60xx, 62xx systems            Hardware Platform Monitoring Guide
                                                                         (this guide)
FlexArray-compatible systems   FAS80xx systems                           Hardware Platform Monitoring Guide
                                                                         (this guide)
SA systems                     SA200, SA300, SA320, SA600, and           Hardware Platform Monitoring Guide
                               SA620 systems                             (this guide)
Disk shelves                   DS2246, DS4243, DS4246, and DS4486        DS4243, DS2246, DS4486, and DS4246 Disk
                                                                         Shelf Installation and Service Guide
                               DS14mk2 FC, and DS14mk4 FC                DiskShelf 14, DiskShelf14mk2 FC, and
                                                                         DiskShelf14mk4 FC Hardware and Service Guide
                               DS14mk2 AT                                DiskShelf14mk2 AT Hardware Service Guide
Third-party hardware           Switches, routers, storage subsystems,    Applicable third-party hardware
                               and tape backup devices                   documentation


Storage system LEDs


LEDs enable you to monitor your storage system and its components.
Each storage system platform has LEDs on the chassis, controller, fans, and PSUs. These LEDs
provide a high-level view of the status of your system and network activity.
Note: For information about disk shelf LEDs, see the appropriate disk shelf guide on the NetApp
Support Site at mysupport.netapp.com.

20xx and SA200 system LEDs


20xx and SA200 systems have LEDs that you can check to learn whether the system and its
individual components are turned on and are operating normally.
LEDs are visible on the front and the back of the system and on the power supply.

Location and meaning of LEDs on the front of 20xx and SA200 chassis
You can check the LEDs on the front of the system to learn whether the power is turned on, whether
there is activity on the controller, whether the system is halted, or whether there is a fault in the
chassis.
The following illustration shows the LEDs on the front of the 20xx and SA200 chassis:


Power LED

Fault LED

Controller module A LED

Controller module B LED

The following table explains what the LEDs on the front of the chassis mean:
LED name                  Status indicator   Description
Power                     Green              The system is receiving power.
                          Off                The system is not receiving power.
Fault                     Amber              The system halted or a fault occurred in the chassis. The error
                                             might be in a PSU, fan, controller module, or internal disk. The
                                             LED also is lit when there is an FRU failure, Data ONTAP is not
                                             running on a controller module, or the system is in Maintenance
                                             mode.
                          Off                The system is operating normally.
A/B (Controller A or B)   Green              The controller is operating and is active.
                          Blinking           This LED blinks in proportion to activity; the greater the
                                             activity, the more frequently the LED blinks. When activity is
                                             absent or very low, the LED does not blink.
                          Off                No activity is detected.


Note: If an internal disk drive fails or is disabled, the fault light on the front of the chassis turns on.
When you remove the faulty or disabled disk drive, the fault light turns off. However, the failure
of disk drives in expansion disk shelves does not affect the fault light on the front of the chassis.
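
The front-panel table and the note above can be sketched as a lookup plus one rule, for example for a monitoring script that turns an observed LED state into the documented meaning. This is a minimal sketch, not NetApp software: the names FRONT_PANEL_LEDS, describe_led, and disk_failure_lights_chassis_fault are hypothetical, and the descriptions are shortened from the table.

```python
# Hypothetical lookup; the LED names, states, and meanings come from the table above.
FRONT_PANEL_LEDS = {
    ("Power", "Green"): "The system is receiving power.",
    ("Power", "Off"): "The system is not receiving power.",
    ("Fault", "Amber"): "The system halted or a fault occurred in the chassis.",
    ("Fault", "Off"): "The system is operating normally.",
    ("A/B", "Green"): "The controller is operating and is active.",
    ("A/B", "Blinking"): "The LED blinks in proportion to activity.",
    ("A/B", "Off"): "No activity is detected.",
}


def describe_led(name: str, state: str) -> str:
    """Return the documented meaning of a front-panel LED state."""
    try:
        return FRONT_PANEL_LEDS[(name, state)]
    except KeyError:
        # Combinations that the table does not list are rejected, not guessed.
        raise ValueError(f"undocumented LED state: {name} / {state}") from None


def disk_failure_lights_chassis_fault(disk_location: str) -> bool:
    """Per the note above, only internal disk failures light the chassis fault LED."""
    if disk_location not in ("internal", "expansion"):
        raise ValueError(f"unknown disk location: {disk_location!r}")
    return disk_location == "internal"


if __name__ == "__main__":
    print(describe_led("Fault", "Amber"))
    print(disk_failure_lights_chassis_fault("expansion"))
```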


Location and meaning of LEDs on the back of 20xx and SA200 controller
modules
You can check the LEDs on the back of the controller module to learn whether the controller module
is functioning properly, or to learn the status of the system network, disk shelf connections, or
NVMEM.
The following LEDs are on the back of the controller module:

Fibre Channel port
Remote management port
Ethernet port
NVMEM
Controller module fault

The following illustration shows the location of LEDs on the rear of 2050 and SA200 controller
modules:

The LEDs on the back of 2020 controller modules are the same as on the back of 2050 and SA200
controller modules, except for the placement of some labels.
The following illustration shows the location of LEDs on the back of 2040 controller modules:

The following table explains what the LEDs on the back of the controller modules mean:
Port type           LED type                      Status indicator   Description
Fibre Channel       LNK                           Green              Link is established and communication is happening.
                                                  Off                No link is established.
SAS                 LNK                           Green              Link is established on at least one external SAS lane.
                                                  Off                No link is established on any external SAS lane.
Remote management   LNK (Left)                    Green              A valid network connection is established.
                                                  Off                There is no network connection present.
                    ACT (Right)                   Amber              There is data activity.
                                                  Off                There is no network activity present.
Ethernet            LNK (Left)                    Green              A valid network connection is established.
                                                  Off                There is no network connection present.
                    ACT (Right)                   Amber              There is data activity.
                                                  Off                There is no network activity present.
N/A                 NVMEM status LED              Blinking green     NVMEM is in battery-backed standby mode.
                                                  Off (power on)     The system is running normally, and NVMEM is armed if
                                                                     Data ONTAP is running.
                                                  Off (power off)    The system is shut down, NVMEM is not armed, and the
                                                                     battery is not enabled.
N/A                 Controller module fault LED   Amber              The controller module is starting up, Data ONTAP is
                                                                     initializing, the controller module is in Maintenance
                                                                     mode, or a controller module fault is detected.
                                                  Off                The controller module is functioning properly.


Attention: Do not replace DIMMs or any other system hardware when the NVMEM LED is
blinking. Doing so might cause you to lose data. Always flush NVMEM contents to disk by
entering a halt command at the system prompt before replacing the hardware.
Attention: To protect critical data in NVMEM, you cannot update BIOS or BMC firmware when
NVMEM is in use. Before updating firmware, ensure that NVMEM no longer contains critical
data by performing a halt command to cleanly shut down Data ONTAP. When the system
reboots to the boot environment prompt, you can update your firmware.

Location and meaning of 20xx and SA200 PSU LEDs


You can check the LEDs on each PSU in your system to see whether the PSU has power and is
functioning properly.
The following illustration shows the location of the PSU LEDs, which are visible at the back of the
system.
Note: The illustration shows only a PSU found in 2050 and SA200 systems. The locations of PSU
LEDs in 2020 and 2040 systems are different, but the LEDs are functionally identical.

AC LED

Fault LED

The following table explains what the PSU LEDs mean:


LED name   LED color   Description
AC         Green       The AC input is good and the switch is on.
           Off         The AC input is bad or the switch is off.
Fault      Amber       The power supply is not functioning properly and needs service. See the
                       system console for any applicable error messages.
           Off         The power supply is functioning properly.

FAS22xx system LEDs


FAS22xx systems have LEDs that you can check to learn whether the system and its individual
components are turned on and are operating normally. FAS22xx systems include FAS2220 and
FAS2240 systems.
LEDs are visible on the front of the chassis, on the back of controllers, and on the PSUs.
FAS22xx systems are available in three models: the 2U FAS2220 system, the 2U FAS2240-2
system, and the 4U FAS2240-4 system.

Location and meaning of LEDs on the front of 22xx chassis


You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of a 2220 or 2240-2 system with the bezel in
place:
1   LEDs
2   Shelf ID digital display

2240-4 systems have 4U chassis, but the placement and function of the LEDs are the same as on
2220 and 2240-2 systems.



The following table shows what the LED labels look like and explains what the LEDs mean:
LED name | Status indicator | Description
Power | Green | Power is being supplied to the system.
Power | Off | No power is being supplied to the system.
Fault | Amber | A fault has occurred in the controller, PSU, or onboard storage, or Data ONTAP is not running.
Fault | Off | The system is operating normally.

The shelf ID digital display shows the shelf ID of the chassis, which contains disk drives.
Note: If the 2220 or 2240 system has no attached disk shelves, then the chassis can have any ID
number. However, if disk shelves are attached, the chassis shelf and attached disk shelves must
have unique ID numbers.
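
The uniqueness rule in the preceding note can be expressed as a small check. The following Python sketch is illustrative only: the function name find_duplicate_shelf_ids and its arguments are assumptions made for this example and are not part of Data ONTAP or any NetApp tool. The IDs are assumed to have been read by hand from the shelf ID digital displays.

```python
from collections import Counter


def find_duplicate_shelf_ids(chassis_id, attached_shelf_ids):
    """Return the sorted shelf IDs that are used more than once.

    chassis_id: ID shown on the chassis shelf ID display (hypothetical input).
    attached_shelf_ids: IDs of the attached disk shelves. An empty list means
    no shelves are attached, in which case any chassis ID is acceptable.
    """
    if not attached_shelf_ids:
        return []
    # Count the chassis ID together with every attached shelf ID.
    counts = Counter([chassis_id, *attached_shelf_ids])
    return sorted(shelf_id for shelf_id, n in counts.items() if n > 1)
```

An empty result means the IDs satisfy the rule. For example, a chassis with ID 0 and attached shelves 1 and 2 produces an empty list, whereas a chassis with ID 1 and attached shelves 1 and 2 reports ID 1 as a duplicate.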
When the bezel is removed, a third LED, indicating activity, is revealed below the fault LED. The
following table shows what the activity LED label looks like and explains what the LED means.
LED name | Status indicator | Description
Activity | Green | A link is established between the controller and storage.

Location and meaning of LEDs on the back of 22xx controllers


You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the ports and LEDs on the back of the controller:

[Illustration: back of the 22xx controller. Visible port labels include 0a, 0b, 1a, 1b, e0a, e0b, e0c, and e0d; link LEDs are marked LNK.]

1  SAS port LEDs
2  SAS ports
3  Controller fault LED
4  NVMEM status LED
5  Optional mezzanine card LEDs (either 2/4/8 Gbps FC or 10 GbE) (2240 systems only)
6  Optional mezzanine card ports (either 2/4/8 Gbps FC or 10 GbE) (2240 systems only)
7  Serial port
8  USB port
9  Remote management Ethernet 10/100 Mb port LEDs
10  Remote management Ethernet 10/100 Mb port
11  Private management 10/100 Mb Ethernet port LEDs
12  Private management 10/100 Mb Ethernet port
13  GbE Ethernet port LEDs
14  GbE Ethernet port

If you have a 2240 system, the optional mezzanine card provides one of the following sets of ports:

Two 2/4/8 Gbps FC ports, each with one LNK LED


Two 10-GbE ports, each with one activity LED and one LNK LED



The following table describes the meaning of the LEDs on the back of the controller:
Name | Type | Status indicator | Description
Serial attached SCSI (SAS) | Link | Green | Link is established on at least 1 external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Controller fault | Activity | Amber | The controller module is starting up, Data ONTAP is initializing, the controller module is in Maintenance mode, or a controller module fault is detected. Note: The LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
NVMEM | NVMEM status | Blinking green | NVMEM is in battery-backed standby mode.
NVMEM | NVMEM status | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.
Fibre Channel | Link | Green | A connection is established on the port.
Fibre Channel | Link | Off | No connection is established on the port.
Ethernet | Link | Green | A link is established between the port and some upstream device.
Ethernet | Link | Off | No link is established.
Ethernet | Activity | Blinking amber | Traffic is flowing over the connection.
Ethernet | Activity | Off | No traffic is flowing over the connection.
Remote management | Link and Activity | Green | A link is established between the port and some upstream device.
Remote management | Link and Activity | Off | No link is established.
Remote management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Remote management | Link and Activity | Off | No traffic is flowing over the connection.
Private management | Link and Activity | Green | A link is established between the port and a downstream disk shelf.
Private management | Link and Activity | Off | No link is established.
Private management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Private management | Link and Activity | Off | No traffic is flowing over the connection.

Location and meaning of 22xx internal drive LEDs


When the bezel of the system is removed, you can view the LEDs on the internal disk drive carriers,
which indicate whether the disk drive is functioning normally.
The following illustration shows the front of a disk drive carrier in 2220 and 2240-4 systems and the
location of its two LEDs:
1  Activity LED
2  Fault LED

The following illustration shows the front of a disk drive carrier in 2240-2 systems and the location
of its two LEDs:
1  Activity LED
2  Fault LED

Although the drive carriers differ in appearance, the behavior of the LEDs is the same. The following
table explains what the LEDs mean:
LED | LED color | Description
Activity | Solid green | The disk drive has power.
Activity | Blinking green | The disk drive has power, and I/O is in progress.
Fault | Solid amber | There is an error with the functioning of the disk drive.
Fault | Not illuminated | The disk drive is functioning normally.

Location and meaning of 22xx PSU LEDs


You can check the LEDs on each PSU to see whether its power is on, and whether the PSU and
integrated fan modules are working properly.
The PSUs on 2220 and 2240-2 systems are different from the PSUs on 2240-4 systems, but the PSU
LEDs function the same way.
The following illustration shows the location of PSU LEDs on the back of 2220 and 2240-2 systems:
1
2
3

AC

PSU OK

DC fault

AC fault

Fan fault

The following illustration shows the location of PSU LEDs on the back of the 2240-4 system:



3
2

Fan fault

AC fault

PSU OK

DC fault

The following table describes what the PSU LEDs on 22xx systems mean:
Name | Status indicator | Description
PSU OK | Green | The PSU is functioning normally. Note: The other three LEDs are not illuminated.
DC fault | Amber | The PSU cannot provide DC voltage to the disk shelf within margin.
AC fault | Amber | The PSU is not turned on or the AC power cord is not plugged in.
Fan fault | Amber | An error occurred with the function of the fan.
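
When you record PSU LED observations during troubleshooting, the table above can be treated as a simple lookup. The following Python sketch is a minimal illustration under stated assumptions: the names PSU_LED_MEANINGS, FAULT_LEDS, and describe_psu_leds are hypothetical and do not correspond to any NetApp tool, and the input is a list of LED names that you observed as lit. The sketch also flags the case in which PSU OK and a fault LED are lit together, because the table states that the other three LEDs are not illuminated when the PSU is functioning normally.

```python
# Descriptions copied from the 22xx PSU LED table. The names used here are
# illustrative assumptions, not part of any NetApp tool.
PSU_LED_MEANINGS = {
    "PSU OK": "The PSU is functioning normally.",
    "DC fault": "The PSU cannot provide DC voltage to the disk shelf within margin.",
    "AC fault": "The PSU is not turned on or the AC power cord is not plugged in.",
    "Fan fault": "An error occurred with the function of the fan.",
}
FAULT_LEDS = {"DC fault", "AC fault", "Fan fault"}


def describe_psu_leds(lit_leds):
    """Return one description line per lit LED (hypothetical helper).

    lit_leds: iterable of LED names observed as illuminated.
    Raises ValueError if a name does not appear in the table.
    """
    lit = list(lit_leds)
    unknown = [name for name in lit if name not in PSU_LED_MEANINGS]
    if unknown:
        raise ValueError(f"unknown PSU LED name(s): {unknown}")
    lines = [f"{name}: {PSU_LED_MEANINGS[name]}" for name in lit]
    # The table notes that the fault LEDs are off when PSU OK is lit.
    if "PSU OK" in lit and FAULT_LEDS.intersection(lit):
        lines.append("Unexpected: a fault LED is lit while PSU OK is lit.")
    return lines
```

Calling describe_psu_leds(["AC fault"]) returns a single line that suggests checking the PSU switch and the power cord first.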


Location and meaning of 22xx internal FRU LEDs


22xx systems contain LEDs inside the controller that help you troubleshoot the FRUs in the controller.
The following FRUs are in the controller and have LEDs on or near them:

DIMMs (2)
RTC battery
Boot media device
Mezzanine card

The FRU LEDs remain unlit when the FRU is functioning normally and turn amber when a problem
occurs. They stay lit for at least 10 minutes even after you remove the controller from the chassis.

FAS25xx system LEDs


FAS25xx systems have LEDs that you can check to learn whether the system and its individual
components are turned on and are operating normally.
LEDs are visible on the front of the chassis, on the back of controllers, and on the PSUs.
FAS25xx systems are available in three models: the 2U FAS2520 and FAS2552 systems, and the 4U
FAS2554 system.

Location and meaning of LEDs on the front of FAS2520, FAS2552, and FAS2554 chassis

You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of a FAS2520 or FAS2552 system with the
bezel in place:

The following illustration shows the LEDs on the front of a FAS2554 system with the bezel in place:


LEDs

Shelf ID digital display

The LEDs are arranged vertically in the following top-to-bottom order:

Power
Attention

If two FAS2520 or FAS2552 controllers are installed in the chassis, Controller A is in the left bay,
and Controller B is in the right bay (when facing the rear of the chassis).
If two FAS2554 controllers are installed in the chassis, Controller A is in the top bay, and Controller
B is in the bottom bay.
FAS2520 and FAS2552 systems have 2U chassis; FAS2554 systems have 4U chassis. The placement
and function of the LEDs are the same on FAS2520, FAS2552, and FAS2554 systems.
The following table shows what the LED labels look like and explains what the LEDs mean:
LED name | Status indicator | Description
Power | Green | At least one of the two PSUs is delivering power to the system.
Power | Off | Neither PSU is delivering power to the system.
Attention | Amber | The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller, or adapter. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. You can check the attention light on the back of each controller to see where the problem occurred. Note: The attention light does not illuminate when you remove the controller from a dual-controller system in an HA pair.
Attention | Off | The system is operating normally.

The shelf ID digital display shows the shelf ID of the chassis, which contains disk drives.
Note: If the system has no attached disk shelves, the chassis can have any ID number. However, if
disk shelves are attached, the chassis shelf and attached disk shelves must have unique ID
numbers.

When the bezel is removed, a third LED, indicating activity, is revealed below the fault LED. The
following table shows what the activity LED label looks like and explains what the LED means:
LED name | Status indicator | Description
Activity | Green | A link is established between the controller and storage.

Location and meaning of FAS25xx internal drive LEDs


When the bezel of the system is removed, you can view the LEDs on the internal disk drive carriers,
which indicate whether the disk drive is functioning normally.
The following illustration shows the front of a disk drive carrier in FAS2520 and FAS2554 systems
and the location of its two LEDs:

1  Activity LED
2  Fault LED

The following illustration shows the front of a disk drive carrier in FAS2552 systems and the
location of its two LEDs:
1  Activity LED
2  Fault LED

Although the drive carriers differ in appearance, the behavior of the LEDs is the same. The following
table explains what the LEDs mean:
LED | LED color | Meaning
Activity | Solid green | The disk drive has power.
Activity | Blinking green | The disk drive has power, and I/O is in progress.
Fault | Solid amber | There is an error with the functioning of the disk drive.
Fault | Not illuminated | The disk drive is functioning normally.

Location and meaning of LEDs on the back of FAS2520 controllers


You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller on which a fault occurred.
The following illustration shows the ports and LEDs on the back of the controller:

1  SAS ports
2  SAS port LEDs
3  NVMEM status LED
4  Controller attention LED
5  10GBase-T data network ports
6  10GBase-T data network port LEDs
7  Serial port
8  USB port (external USB devices not currently supported)
9  Remote management 10/100/1000Base-T port
10  Remote management port LEDs
11  Private management 10/100/1000Base-T port
12  Private management port LEDs
13  1000Base-T data ports
14  1000Base-T data port LEDs

The following table describes the meaning of the LEDs on the back of the controller:
Name | Type | Status indicator | Meaning
Serial attached SCSI (SAS) | Link | Green | Link is established on at least one external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Controller attention | Attention | Amber | The controller module is starting up, Data ONTAP is initializing, the controller module is in Maintenance mode, or a controller module fault is detected. Note: The LED might be illuminated on both controllers.
Controller attention | Attention | Off | The controller is functioning properly.
NVMEM | NVMEM status | Blinking green | NVMEM is in battery-backed standby mode.
NVMEM | NVMEM status | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.
Ethernet | Link | Green | A connection is established on the port.
Ethernet | Link | Off | No connection is established on the port.
Ethernet | Activity | Blinking amber | Traffic is flowing over the connection.
Ethernet | Activity | Off | No traffic is flowing over the connection.
Remote management | Link and Activity | Green | A link is established between the port and some upstream device.
Remote management | Link and Activity | Off | No link is established.
Remote management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Remote management | Link and Activity | Off | No traffic is flowing over the connection.
Private management | Link and Activity | Green | A link is established between the port and a downstream disk shelf.
Private management | Link and Activity | Off | No link is established.
Private management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Private management | Link and Activity | Off | No traffic is flowing over the connection.

Location and meaning of LEDs on the back of FAS255x controllers


You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller on which a fault occurred.
The following illustration shows the ports and LEDs on the back of the controller:

1  SAS ports
2  SAS port LEDs
3  NVMEM status LED
4  Controller fault LED
5  UTA2 (CNA) data network ports: 10GbE / 16Gb FC / 8Gb FC
6  UTA2 (CNA) data network port LEDs
7  Serial port
8  USB port (external USB devices not currently supported)
9  Remote management 10/100/1000Base-T port
10  Remote management port LEDs
11  Private management 10/100/1000Base-T port
12  Private management port LEDs
13  1000Base-T data ports
14  1000Base-T data port LEDs

The following table describes the meaning of the LEDs on the back of the controller:
Name | Type | Status indicator | Description
Serial attached SCSI (SAS) | Link | Green | Link is established on at least one external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Controller attention | Attention | Amber | The controller module is starting up, Data ONTAP is initializing, the controller module is in Maintenance mode, or a controller module fault is detected. Note: The LED might be illuminated on both controllers.
Controller attention | Attention | Off | The controller is functioning properly.
NVMEM | NVMEM status | Blinking green | NVMEM is in battery-backed standby mode.
NVMEM | NVMEM status | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.
Fibre Channel (for UTA2/CNA ports configured in Fibre Channel mode) | Link | Green | A connection is established on the port.
Fibre Channel (for UTA2/CNA ports configured in Fibre Channel mode) | Link | Off | No connection is established on the port.
Fibre Channel (for UTA2/CNA ports configured in Fibre Channel mode) | Link | Blinking amber | Traffic is flowing over the connection.
Fibre Channel (for UTA2/CNA ports configured in Fibre Channel mode) | Link | Off | No traffic is flowing over the connection.
Ethernet (for dedicated Ethernet ports and UTA2/CNA ports configured in Ethernet mode) | Link | Green | A connection is established on the port.
Ethernet (for dedicated Ethernet ports and UTA2/CNA ports configured in Ethernet mode) | Link | Off | No connection is established on the port.
Ethernet (for dedicated Ethernet ports and UTA2/CNA ports configured in Ethernet mode) | Activity | Blinking amber | Traffic is flowing over the connection.
Ethernet (for dedicated Ethernet ports and UTA2/CNA ports configured in Ethernet mode) | Activity | Off | No traffic is flowing over the connection.
Remote management | Link and Activity | Green | A connection is established on the port.
Remote management | Link and Activity | Off | No connection is established on the port.
Remote management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Remote management | Link and Activity | Off | No traffic is flowing over the connection.
Private management | Link and Activity | Green | A link is established between the port and a downstream disk shelf.
Private management | Link and Activity | Off | No link is established.
Private management | Link and Activity | Blinking amber | Traffic is flowing over the connection.
Private management | Link and Activity | Off | No traffic is flowing over the connection.

Location and meaning of FAS25xx PSU LEDs


You can check the LEDs on each PSU to see whether its power is on and whether the PSU and
integrated fan modules are working properly.
The PSUs on FAS2520 and FAS2552 systems are different from the PSUs on FAS2554 systems, but
the PSU LEDs function the same way.
The following illustration shows the location of PSU LEDs on the back of FAS2520 and FAS2552
systems:



1
2
3

AC

PSU OK

DC fault

AC fault

Fan fault

The following illustration shows the location of PSU LEDs on the back of the FAS2554 system:
3
1


Fan fault

AC fault

PSU OK

DC fault

The following table describes what the PSU LEDs on FAS25xx systems mean:
Name | Status indicator | Meaning
PSU OK | Green | The PSU is functioning normally. Note: The other three LEDs are not illuminated.
DC fault | Amber | The PSU cannot provide DC voltage to the disk shelf within margin.
AC fault | Amber | The PSU is not turned on or the AC power cord is not plugged in.
Fan fault | Amber | An error occurred with the function of the fan.

Location and meaning of FAS25xx internal FRU LEDs


FAS25xx systems contain LEDs inside the controller that help you troubleshoot the FRUs in the
controller.
The following FRUs are in the controller and have LEDs on or near them:

DIMMs (2)
NVMEM battery
RTC battery
Boot media device

Except for the FRU fault LEDs, the internal status LEDs are only visible when the controller module
is operating with the cover removed in an open chassis. The internal FRU LEDs remain unlit when
the FRU is functioning normally and turn amber when a problem occurs. They stay lit for at least 10
minutes even after you remove the controller from the chassis.


SA300 system LEDs


SA300 systems have LEDs that you can check to learn whether the system and its components are
turned on and operating normally.
LEDs are visible on the front and rear of each system and on the power supplies.

Location and meaning of LEDs on the front of SA300 controllers


You can check the LEDs on the front of the controller to learn whether the power is turned on,
whether there is activity on the controller, whether the system is halted, and whether a fault has
occurred.
The following illustration shows the LEDs on the front of the controller:

Activity LED

Status LED

Power LED

The following table explains the meaning of the LEDs:


LED label | Status indicator | Description
Activity | Green | The system is operating and active.
Activity | Blinking | The system is actively processing data.
Activity | Off | No activity is detected.
Status | Green | The system is operating normally.
Status | Amber | The system halted or a fault occurred. The fault is displayed in the LCD. Note: This LED remains illuminated during boot, while the operating system loads.
Power | Green | The system is receiving power.
Power | Off | The system is not receiving power.

Location and meaning of LEDs on the back of SA300 controllers


You can check the LEDs on the back of the controller to learn the status of the controller network
connections.
The following LEDs are visible on the back of the controller:

FC port LEDs
GbE port LEDs
RLM LEDs

The following illustration shows the location of LEDs on the back of the controller:

FC port LEDs

GbE port LEDs

RLM LEDs

The following table explains what the LEDs on the back of the controller mean:



Port type | LED type | Status indicator | Description
FC | LNK | Off | No link with the Fibre Channel is established.
FC | LNK | Green | A link is established.
GbE and RLM | LNK | On | A valid network connection is established.
GbE and RLM | LNK | Off | There is no network connection.
GbE and RLM | ACT | On | There is data activity.
GbE and RLM | ACT | Off | There is no network activity present.

Location and meaning of SA300 fan LEDs


You can check the LED on each fan module FRU to identify a problem in that FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:
1

Fan module FRU LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see an
error message that indicates a fan problem, you can remove the bezel and use the illuminated fan
FRU LED to locate the FRU in which the problem occurred.


Location and meaning of SA300 PSU LEDs


You can check the LEDs on the PSUs to learn whether they are functioning normally.
The following illustration shows the location of the PSU LEDs on the back of the system:

PSU 1

PSU 2

PSU LEDs

The following table explains what the PSU LEDs mean:


AC LED | OK or Status LED | Description
Amber | Green | No fault is indicated.
Off | Off | There is no external power; check the connections and the power source.
Amber | Off | The system displays the LOADER> prompt because it has not booted Data ONTAP.
Flashing amber | Amber | There is a power supply fault; replace the power supply.
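
Because the SA300 PSU table is read as a combination of two LEDs, it can help to think of it as a lookup keyed on the pair of states. The following Python sketch is illustrative only: SA300_PSU_STATES and describe_sa300_psu are hypothetical names chosen for this example, and the state strings are assumed to be typed in by the person observing the LEDs. Combinations that the table does not list are reported as such instead of being guessed.

```python
# (AC LED state, OK or Status LED state) -> description, as listed in the
# SA300 PSU LED table. Names are illustrative, not part of any NetApp tool.
SA300_PSU_STATES = {
    ("amber", "green"): "No fault is indicated.",
    ("off", "off"): "There is no external power; check the connections and the power source.",
    ("amber", "off"): "The system displays the LOADER> prompt because it has not booted Data ONTAP.",
    ("flashing amber", "amber"): "There is a power supply fault; replace the power supply.",
}


def describe_sa300_psu(ac_led, status_led):
    """Look up the description for an observed pair of LED states."""
    # Normalize case and surrounding whitespace before the lookup.
    key = (ac_led.strip().lower(), status_led.strip().lower())
    return SA300_PSU_STATES.get(key, "Combination not listed in the table.")
```

For example, describe_sa300_psu("Off", "Off") returns the row that tells you to check the connections and the power source.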

31xx system LEDs


31xx systems have LEDs that you can check to learn whether the system and its individual
components are turned on and operating normally.
LEDs are visible on the front and rear of each system, and on the fan FRUs and the power supplies.

Location and meaning of LEDs on the front of 31xx chassis


You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the chassis:

LEDs on the front of the system

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

Power
Fault
Controller A activity
Controller B activity

Controller A is the controller in the top of the chassis, and Controller B is the controller in the bottom
of the chassis.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:

Power
Fault
Controller A activity
Controller B activity

The following table shows what the LED labels look like and explains what the LEDs mean:
LED name | Status indicator | Description
Power | Green | At least one of the two PSUs is delivering power to the system.
Power | Off | Neither PSU is delivering power to the system.
Fault | Amber | The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, or controller. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. Note: You can check the fault light on the back of each controller to see where the problem occurred. Note: The fault light does not come on when you remove the controller from a dual-controller system in an HA pair.
Fault | Off | Both controllers are operating normally.
Activity (A/B) | Blinking green | Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity (A/B) | Off | Data ONTAP is not running on the controller.


Location and meaning of LEDs on the back of 31xx controllers


You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following LEDs are visible on the back of the controller:

Ethernet port
Fault
FC port

The following illustration shows the location of the LEDs on the back of the controller:

The following table explains the behavior of the LEDs on the back of the controller:
Type name | LED type | Status indicator | Description
Ethernet port | Link (left) | Green | A link is established between the port and some upstream device.
Ethernet port | Link (left) | Off | No link is established.
Ethernet port | Activity (right) | Amber | Traffic is flowing over the connection.
Ethernet port | Activity (right) | Off | No traffic is flowing over the connection.
Management port (Ethernet) | Link (left) | Green | A link is established between the port and some upstream device.
Management port (Ethernet) | Link (left) | Off | No link is established.
Management port (Ethernet) | Activity (right) | Amber | Traffic is flowing over the connection.
Management port (Ethernet) | Activity (right) | Off | No traffic is flowing over the connection.
Controller fault | Activity | Amber | The controller is the one causing the front panel LED to be illuminated. Note: This LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
FC | Link | Green | A loop connection is established on the port.
FC | Link | Off | No loop connection is established on the port.

Location and meaning of 31xx fan LEDs


You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:

Fan module FRU LED

The fan module FRU LED is amber and turns on when a problem occurs in the fan. If you see error
messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU LED
to locate the FRU where the problem occurred.

Location and meaning of 31xx PSU LEDs


You can check the LEDs on each AC PSU or DC PSU to see whether its power is on and whether the
PSU is working properly.
The following illustration shows the location of AC PSU LEDs on the back of the system. DC PSUs
have different power connectors, but their LEDs are the same.


Fault LED

Power LED

The following table describes what the AC PSU and DC PSU LEDs mean:
PSU type | PSU condition | Power LED status | Fault LED status
AC or -48VDC | PSU is present and switched on. Normal mode. | Green | Off
AC or -48VDC | PSU is missing or switched off. The other PSU is off or functioning normally. | Off | Off
AC or -48VDC | PSU fault: AC in or -48VDC is out of range, or there is a DC fault or fan fault. | Off | Blinking amber
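
The 31xx PSU table maps each pair of Power LED and Fault LED states to one PSU condition, which is the same for AC and -48VDC PSUs. The following Python sketch shows one way to encode that mapping. It is a minimal illustration under stated assumptions: PsuCondition and classify_psu are hypothetical names, the input strings are assumed to be entered by the observer, and any pair that the table does not list is classified as unlisted rather than interpreted.

```python
from enum import Enum


class PsuCondition(Enum):
    """PSU conditions from the 31xx PSU table (illustrative names)."""
    NORMAL = "PSU is present and switched on (normal mode)."
    MISSING_OR_OFF = "PSU is missing or switched off."
    FAULT = "PSU fault: input out of range, DC fault, or fan fault."
    UNLISTED = "LED combination is not listed in the table."


# (Power LED state, Fault LED state) -> condition, one entry per table row.
_LED_PAIRS = {
    ("green", "off"): PsuCondition.NORMAL,
    ("off", "off"): PsuCondition.MISSING_OR_OFF,
    ("off", "blinking amber"): PsuCondition.FAULT,
}


def classify_psu(power_led, fault_led):
    """Return the PsuCondition for an observed pair of LED states."""
    key = (power_led.strip().lower(), fault_led.strip().lower())
    return _LED_PAIRS.get(key, PsuCondition.UNLISTED)
```

Using an enumeration rather than free text keeps the three documented conditions distinct from an observation that the table does not cover.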


Location and meaning of 31xx FRU LEDs


31xx systems have 15 internal LEDs that assist in troubleshooting FRU problems.
Eleven LEDs are next to the FRUs on the controller board: (up to eight) DIMMs, CompactFlash,
RLM, and the RTC battery. When an LED is lit, it indicates that the FRU next to it needs to be
replaced.
Four LEDs are on the PCIe riser, one per PCIe slot. When one of the LEDs is lit, it indicates that
there is a problem with the card in that particular PCIe slot.
The FRU LEDs stay lit for at least 10 minutes even after you remove the controller from the system.

32xx and SA320 system LEDs


32xx and SA320 systems have LEDs that you can check to learn whether the system and its
individual components are turned on and are operating normally.
LEDs are visible on the front of the chassis, on the back of controllers and I/O expansion modules,
and on fan FRUs and power supplies.

Location and meaning of LEDs on the front of 32xx and SA320 chassis
You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the chassis:

LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

Power
Fault
Controller A activity
Controller B activity

When two controllers are installed in the chassis, Controller A is the controller in the top bay and
Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are
installed in the chassis, the controller is always in the top bay and the I/O expansion module is
always in the bottom bay.
The following table shows what the LED labels look like and explains what the LEDs mean:
LED name | Status indicator | Description
Power | Green | Power is being supplied to the system.
Power | Off | No power is being supplied to the system.
Fault | Amber | The system halted, or a fault occurred in the chassis.
Fault | Off | The controllers are operating normally, or the controller and the I/O expansion module are operating normally.
Controller A/B | Blinking green | Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity. Note: If an I/O expansion module is installed in the chassis, the corresponding controller activity LED is not lit.
Controller A/B | Off | Data ONTAP is not running on the controller.

Location and meaning of LEDs on the back of 32xx and SA320 controllers
You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the ports and LEDs on the back of the controller:

[Illustration: back of the 32xx and SA320 controller. Visible port labels include c0a, c0b, 0a, 0b, 0c, 0d, e0a, and e0b; link LEDs are marked LNK.]

1  SAS port LEDs
2  SAS ports
3  HA port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
4  HA ports
5  Fibre Channel port LEDs (the LED pointing up belongs to the upper port; the LED pointing down belongs to the lower port)
6  Fibre Channel ports
7  1-GbE port LEDs
8  1-GbE ports
9  Management Ethernet 10/100 Mb port LEDs
10  Private management 10/100 Mb Ethernet port
11  USB (top) and serial console (bottom) ports (External USB devices are not currently supported)
12  Controller fault LED
13  NVMEM LED

The following table describes the meaning of the LEDs on the back of the controller:
Name | Type | Status indicator | Description
Serial attached SCSI (SAS) | Link | Green | A link is established on at least 1 external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Fibre Channel | Link | Green | A connection is established on the port.
Fibre Channel | Link | Off | No connection is established on the port.
Ethernet | Link | Green | A link is established between the port and some upstream device.
Ethernet | Link | Off | No link is established.
Ethernet | Activity | Amber | Traffic is flowing over the connection.
Ethernet | Activity | Off | No traffic is flowing over the connection.
Remote management | Link and Activity | Green | A link is established between the port and some upstream device.
Remote management | Link and Activity | Off | No link is established.
Remote management | Link and Activity | Amber | Traffic is flowing over the connection.
Remote management | Link and Activity | Off | No traffic is flowing over the connection.
Private management | Link and Activity | Green | A link is established between the port and a downstream disk shelf.
Private management | Link and Activity | Off | No link is established.
Private management | Link and Activity | Amber | Traffic is flowing over the connection.
Private management | Link and Activity | Off | No traffic is flowing over the connection.
Controller fault | Activity | Amber | A problem has occurred in the controller. This in turn has caused the system fault LED on the front of the chassis to be illuminated. Note: The LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
NVMEM | NVMEM status | Blinking green | The NVMEM is in battery-backed standby mode.
NVMEM | NVMEM status | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.

Location and meaning of LED on the back of 32xx and SA320 I/O expansion modules
You can check the back of the I/O expansion module to detect whether a fault has occurred.
The following illustration shows the ports and LEDs on the back of an I/O expansion module:


PCIe slots (labeled 3, 4, 5, and 6)

Fault LED

The following table describes the meaning of the LED on the I/O expansion module:
Name | Type | Status indicator | Description
I/O expansion module fault | Activity | Amber | A fault has occurred.
I/O expansion module fault | Activity | Off | The I/O expansion module is functioning normally.

Location and meaning of 32xx and SA320 fan LEDs


You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:


LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.

Location and meaning of 32xx and SA320 PSU LEDs


You can check the LEDs on each PSU to see whether its power is on and whether the PSU is
working properly.
The following illustration shows the location of PSU LEDs on the back of the system:


Fault LED

Power LED

The following table describes what the PSU LEDs mean:


Power LED status  Fault LED status  PSU condition
Green             Off               The PSU is present and switched on (normal mode).
Off               Off               The PSU is missing or switched off. The other PSU is off or functioning normally.
Off               Blinking amber    PSU fault: AC in is out of range, or there is a DC fault or fan fault.

Location and meaning of 32xx and SA320 internal FRU LEDs


32xx and SA320 systems contain LEDs inside the controller and the I/O expansion module that help you
troubleshoot the FRUs in those modules.
The following FRUs are in the controller and have LEDs on or near them:

DIMMs (up to 4)
RTC battery
USB device


PCIe slots (2)

The I/O expansion module has four PCIe slots, each with an LED.
The FRU LEDs remain unlit when the FRU is functioning normally and turn amber when a problem
occurs. They stay lit for at least 10 minutes even after you remove the controller or I/O expansion
module from the chassis.

60xx and SA600 system LEDs


60xx and SA600 systems have LEDs that you can check to learn whether the system and its
components are turned on and operating normally.
LEDs are visible on the front and rear of each system, and on the fan FRUs and the power supplies.

Location and meaning of LEDs on the front of 60xx and SA600 controllers
You can check the LEDs on the front of the controller to learn whether the power is turned on,
whether the system is active, whether the system is halted, or whether there is a fault in the chassis.
The following illustration shows the LEDs on the front of the controller.

1
2
3

Activity LED

Status LED

Power LED

The following table explains what the LEDs on the front of the controller mean:

LED label  Status indicator  Description
Activity   Green             The system is operating and active.
Activity   Blinking          The system is actively processing data.
Activity   Off               No activity is detected.
Status     Green             The system is operating normally.
Status     Amber             The system halted or a fault occurred. The fault is displayed in the LCD.
Power      Green             The system is receiving power.
Power      Off               The system is not receiving power.

Attention: The Status LED remains lit during the boot process while the operating system loads.

Location and meaning of LEDs on the back of 60xx and SA600 controllers
You can check the LEDs on the back of the controller to learn the status of network and disk shelf
connections.
The following illustration shows the location of LEDs on the back of the controller:
1

GbE port LEDs

RLM port LEDs


Fibre Channel port LEDs

The following table explains what the LEDs on the rear of the controller mean:
Port type      LED type     Status indicator                       Description
Fibre Channel  LNK (Green)  Off                                    No link with the Fibre Channel is established.
Fibre Channel  LNK (Green)  Blinking (6030 and 6070 systems)       A link is established and communication is happening.
Fibre Channel  LNK (Green)  Solid (6040, 6080, and SA600 systems)  A link is established and communication is happening.
GbE and RLM    LNK          On                                     A valid network connection is established.
GbE and RLM    LNK          Off                                    There is no network connection.
GbE and RLM    ACT          On                                     There is data activity.
GbE and RLM    ACT          Off                                    There is no network activity present.
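
The model-dependent behavior of the Fibre Channel LNK LED can be encoded in a small script, for example when recording visual inspection results. The following Python sketch is only an illustration: the function name, the model strings, and the state strings are hypothetical and are not part of any NetApp tool.

```python
# Hypothetical helper that interprets the Fibre Channel LNK LED on 60xx and
# SA600 controllers according to the table in this section.

BLINKING_MODELS = {"6030", "6070"}         # a working link shows as blinking
SOLID_MODELS = {"6040", "6080", "SA600"}   # a working link shows as solid


def describe_fc_lnk(model: str, state: str) -> str:
    """Return the table's description for an observed LNK LED state.

    state is expected to be "off", "blinking", or "solid".
    """
    if state == "off":
        return "No link with the Fibre Channel is established."
    if model in BLINKING_MODELS and state == "blinking":
        return "A link is established and communication is happening."
    if model in SOLID_MODELS and state == "solid":
        return "A link is established and communication is happening."
    # The table does not define this model/state combination.
    return "Not described in the table."


if __name__ == "__main__":
    print(describe_fc_lnk("6080", "solid"))
```

Combinations that the table does not list, such as a blinking LED on a 6080, are reported as undefined rather than guessed.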

Location and meaning of 60xx and SA600 fan LEDs


You can check the fan LEDs to learn whether the fan is functioning properly.
The following illustration shows the location of the fan LEDs, which you can see when you remove
the bezel from the system:
1


Fan

LEDs

The following table describes the behavior of the fan LEDs:


LED status       Description
Orange blinking  The fan failed.
Off              There is no power to the system, or the fan is operational.

Location and meaning of 60xx and SA600 PSU LEDs


You can check the LEDs to learn whether the PSUs are providing power to your system and whether
they are functioning properly.
The following illustration shows the location of the PSU LEDs on your system:
1

LEDs

Power supply

The following table explains what the PSU LEDs mean:

Amber (AC input): On
Green (PSU status): On
Description: The AC power source is good, and the PSU is providing power to the system.
Corrective action: N/A

Amber (AC input): On
Green (PSU status): Off
Description: AC power is present, but the PSU is not delivering power to the system.
Corrective action: Ensure that the PSU is properly seated and that its cables are connected and secure.

Amber (AC input): On
Green (PSU status): Blinking
Description: AC power is present, but the power supply is not enabled.
Corrective action:
1. Log in to the RLM, and then enter the following command:
   system power on
   Note: Using the system power command might cause an improper shutdown of the storage system.
   During power-cycling, a brief pause occurs before power is turned back on.
2. If the problem persists, contact technical support.

Amber (AC input): Off
Green (PSU status): Off
Description: AC power is either not present or not within operational limits.
Corrective action: Check the AC switch, AC power cable, and upstream circuit breakers.
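
If you log LED observations during rack checks, the four combinations above can be kept in a lookup table. The following Python sketch is a hedged illustration only: the dictionary, the function name, and the lowercase state strings are assumptions made for this example, and the text is transcribed from the table rather than taken from any NetApp software.

```python
# Hypothetical lookup table transcribed from the 60xx and SA600 PSU LED table.
# Keys: (state of amber AC-input LED, state of green PSU-status LED).
# Values: (description, corrective action).
PSU_LED_TABLE = {
    ("on", "on"): (
        "The AC power source is good, and the PSU is providing power to the system.",
        "N/A",
    ),
    ("on", "off"): (
        "AC power is present, but the PSU is not delivering power to the system.",
        "Ensure that the PSU is properly seated and that its cables are connected and secure.",
    ),
    ("on", "blinking"): (
        "AC power is present, but the power supply is not enabled.",
        "Log in to the RLM and enter 'system power on'. "
        "If the problem persists, contact technical support.",
    ),
    ("off", "off"): (
        "AC power is either not present or not within operational limits.",
        "Check the AC switch, AC power cable, and upstream circuit breakers.",
    ),
}


def diagnose_psu(amber: str, green: str) -> tuple:
    """Look up the description and corrective action for a pair of LED states."""
    key = (amber.strip().lower(), green.strip().lower())
    if key not in PSU_LED_TABLE:
        # Combinations that the table does not list are rejected, not guessed.
        raise ValueError(f"LED combination not listed in the table: {key}")
    return PSU_LED_TABLE[key]


if __name__ == "__main__":
    description, action = diagnose_psu("On", "Blinking")
    print(description)
    print(action)
```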

62xx and SA620 system LEDs


62xx and SA620 systems have LEDs that you can check to learn whether the system and its
individual components are turned on and operating normally.
LEDs are visible on the front of the chassis, the rear of controllers and I/O expansion modules, and
on fan FRUs and power supplies.

Location and meaning of LEDs on the front of 62xx and SA620 chassis
You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the 62xx and SA620 chassis:


Chassis LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

Power
Fault
Controller A activity
Controller B activity

When two controllers are installed in the chassis, Controller A is the controller in the top bay, and
Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are
installed in the chassis, the controller is always in the top bay and the I/O expansion module is
always in the bottom bay.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:

Power
Fault
Controller A activity
Controller B activity

The following table explains what the chassis LEDs mean:

LED name  Status indicator  Description
Power     Green             At least one of the two PSUs is delivering power to the system.
Power     Off               Neither PSU is delivering power to the system.
Fault     Amber             The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller, or I/O expansion module. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. You can check the fault light on the back of each controller to see where the problem occurred. Note: The fault light does not come on when you remove the controller from a dual-controller system in an HA pair.
Fault     Off               The system is operating normally.
Activity  Blinking green    Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity  Off               Data ONTAP is not running on the controller.
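
The triage flow described for the Fault LED (front panel first, then the fault light on the back of each controller) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the dictionary of observations are hypothetical, and the LED states are assumed to be entered by hand after a visual check.

```python
from typing import Dict, List


def controllers_to_inspect(front_fault_amber: bool,
                           rear_fault_amber: Dict[str, bool]) -> List[str]:
    """Return the names of controllers whose rear fault LED is lit.

    If the front fault LED is amber but no rear controller LED is lit, the
    table suggests checking the other listed components (PSU, fan, or
    I/O expansion module).
    """
    if not front_fault_amber:
        # Front fault LED off: the system is operating normally.
        return []
    return sorted(name for name, lit in rear_fault_amber.items() if lit)


if __name__ == "__main__":
    # Example observation: front fault LED lit, only Controller B's rear LED lit.
    print(controllers_to_inspect(True, {"Controller A": False, "Controller B": True}))
```

Because the rear fault LED might be illuminated on both controllers, the function returns a list rather than a single name.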

Location and meaning of LEDs on the back of 62xx and SA620 controllers
You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the LEDs on the left side of the back of the 62xx and SA620
controllers:

Remote management port LEDs
Private management port LEDs
GbE port LEDs
Controller fault LED
Remote management port
Private management port
GbE port
10-GbE ports
10-GbE port LEDs

The following table describes the meaning of the LEDs on the left side of the back of the controller:

LED name            LED type          Status indicator  Description
Fault               Activity          Amber             The controller is the one causing the front panel fault LED to be illuminated. Note: The LED might be illuminated on both controllers.
Fault               Activity          Off               The controller is functioning properly.
Remote management   Link (Left)       Green             A link is established between the port and some upstream device.
Remote management   Link (Left)       Off               No link is established.
Remote management   Activity (Right)  Amber             Traffic is flowing over the connection.
Remote management   Activity (Right)  Off               No traffic is flowing over the connection.
Private management  Link (Left)       Green             A link is established between the port and a downstream disk shelf.
Private management  Link (Left)       Off               No link is established.
Private management  Activity (Right)  Amber             Traffic is flowing over the connection.
Private management  Activity (Right)  Off               No traffic is flowing over the connection.
GbE                 Link (Left)       Green             A link is established between the port and some upstream device.
GbE                 Link (Left)       Off               No link is established.
GbE                 Activity (Right)  Amber             Traffic is flowing over the connection.
GbE                 Activity (Right)  Off               No traffic is flowing over the connection.
10 GbE              Activity (Top)    Amber             Traffic is flowing over the connection.
10 GbE              Activity (Top)    Off               No traffic is flowing over the connection.
10 GbE              Link (Bottom)     Green             A link is established between the port and some upstream device.
10 GbE              Link (Bottom)     Off               No link is established.

The labels of the GbE and 10 GbE LEDs include the port number.

The following illustration shows the location of ports and LEDs on the right side of the back of the
controller.

USB port
8-Gb Fibre Channel port LED
8-Gb Fibre Channel ports
Serial port

The following table describes the meaning of the LEDs on the right side of the back of the controller:

LED name            LED type  Status indicator  Description
8-Gb Fibre Channel  Link      Green             A connection is established on the port.
8-Gb Fibre Channel  Link      Off               No connection is established on the port.

The label of each 8-Gb Fibre Channel LED includes the port number.

Location and meaning of the 62xx and SA620 I/O expansion module LED
You can check the back of the I/O expansion module to detect whether a fault has occurred.
The following illustration shows the ports and LED on the back of a 62xx and SA620 I/O expansion
module:
2

Fault LED

PCIe slots

Vertical I/O slots

The following table describes the meaning of the LED on the I/O expansion module:

LED name  LED type  Status indicator  Description
Fault     Activity  Amber             A fault has occurred.
Fault     Activity  Off               The I/O expansion module is operating properly.

Location and meaning of 62xx and SA620 fan LEDs


You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:

LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.

Location and meaning of 62xx and SA620 PSU LEDs


You can check the LEDs on each PSU to see whether its power is on and whether the PSU is
working properly.
The following illustration shows the location of PSU LEDs on the back of the system:


Fault LED

Power LED

The following table describes what the PSU LEDs mean:


Power LED status  Fault LED status  PSU condition
Green             Off               The PSU is present and switched on (normal mode).
Off               Off               The PSU is switched off.
Off               Blinking amber    PSU fault: The AC in is out of range, or there is a DC fault or fan fault.

Location and meaning of 62xx and SA620 internal FRU LEDs


62xx and SA620 systems contain LEDs near the FRUs inside the controller and the I/O expansion
module that assist in troubleshooting the FRUs.
The following FRU LEDs are in the controller:

DIMMs (up to 12)
RTC battery
USB boot device
PCIe slots
10-GbE slot
I/O slots (2)

The following FRU LEDs are in the I/O expansion module:

PCIe slots
I/O slots

FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs.
They stay lit for at least 10 minutes even after you remove the controller or I/O expansion module
from the chassis.

FAS80xx system LEDs


FAS80xx systems have LEDs that you can check to learn whether the system and its individual
components are turned on and operating normally.
LEDs are visible on the front of the chassis, on the rear of controllers and I/O expansion modules,
and on fan FRUs and power supplies.
FAS80xx systems are available in four models: the 3U FAS8020, and the 6U FAS8040, FAS8060,
and FAS8080 systems.

Location and meaning of LEDs on the front of the FAS8020 chassis


You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the FAS8020 chassis:

Chassis LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

Power
Attention


Controller A activity
Controller B activity

When two controllers are installed in the chassis, Controller A is the controller in the top bay, and
Controller B is the controller in the bottom bay.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:

Power
Attention
Controller A activity
Controller B activity

The following table explains what the chassis LEDs mean:

LED name   Status indicator  Description
Power      Green             At least one of the two PSUs is delivering power to the system.
Power      Off               Neither PSU is delivering power to the system.
Attention  Amber             The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller, or I/O expansion module. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. You can check the attention light on the back of each controller to see where the problem occurred. Note: The attention light does not illuminate when you remove the controller from a dual-controller system in an HA pair.
Attention  Off               The system is operating normally.
Activity   Blinking green    Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity   Off               Data ONTAP is not running on the controller.


Location and meaning of LEDs on the front of FAS8040, FAS8060, and FAS8080 chassis
You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the FAS8040, FAS8060, and FAS8080
chassis:

Chassis LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

Power
Attention
Controller A activity
Controller B activity
Note: The Controller B activity LED does not illuminate on FAS80xx systems equipped with I/O
expansion modules (IOXM).

Controller A is the controller in the top bay, and Controller B is the controller in the bottom bay.
When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom order:

Power
Attention
Controller A activity


Controller B activity
Note: The Controller B activity LED does not illuminate on FAS80xx systems equipped with I/O
expansion modules (IOXM).

The following table explains what each chassis LED means:

LED name   Status indicator  Description
Power      Green             At least one of the two PSUs is delivering power to the system.
Power      Off               Neither PSU is delivering power to the system.
Attention  Amber             The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller, or I/O expansion module. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. You can check the attention LED on the back of each controller to see where the problem occurred. Note: The attention LED does not illuminate when you remove the controller from a dual-controller system in an HA pair.
Attention  Off               The system is operating normally.
Activity   Blinking green    Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity   Off               Data ONTAP is not running on the controller.

Location and meaning of LEDs on the back of FAS8020 controllers


You can check the LEDs on the back of the controller to learn the status of the controller's network or
disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the LEDs on the back of a FAS8020 controller.

1   SAS port LEDs
2   SAS ports
3   10GbE port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
4   10GbE ports
5   UTA2 (CNA) data network port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
6   UTA2 (CNA) data network ports: 10GbE / 16Gb FC / 8Gb FC
7   10/100/1000Base-T data network port LEDs
8   10/100/1000Base-T data network ports
9   Management port LEDs: remote management (top) and private management (bottom)
10  Management ports: 10/100/1000Base-T remote management (top) and 10/100Base-T private management (bottom)
11  USB (top) and serial (bottom) ports (External USB devices not currently supported)
12  NVRAM LED
13  Controller attention LED

The following table describes the meaning of the LEDs on the back of the controller:
Name                        Type              Status indicator  Description
Serial attached SCSI (SAS)  Link              Green             A link is established on at least one external SAS lane.
Serial attached SCSI (SAS)  Link              Off               No link is established on any external SAS lane.
Ethernet                    Link (Left)       Green             A link is established between the port and some upstream device.
Ethernet                    Link (Left)       Off               No link is established.
Ethernet                    Activity (Right)  Amber             Traffic is flowing over the connection.
Ethernet                    Activity (Right)  Off               No traffic is flowing over the connection.
Fibre Channel               Link              Green             A connection is established to the port.
Fibre Channel               Link              Off               No connection is established to the port.
Remote management           Link (Left)       Green             A link is established between the port and some upstream device.
Remote management           Link (Left)       Off               No link is established.
Remote management           Activity (Right)  Amber             Traffic is flowing over the connection.
Remote management           Activity (Right)  Off               No traffic is flowing over the connection.
Private management          Link (Left)       Green             A link is established between the port and a downstream disk shelf.
Private management          Link (Left)       Off               No link is established.
Private management          Activity (Right)  Amber             Traffic is flowing over the connection.
Private management          Activity (Right)  Off               No traffic is flowing over the connection.
NVRAM (labeled NV)          NVRAM status      Blinking green    NVRAM destage/restage events are occurring.
NVRAM (labeled NV)          NVRAM status      Solid green       NVRAM destage/restage events completed successfully.
NVRAM (labeled NV)          NVRAM status      Off (power on)    The system is running normally, and the NVRAM is armed if Data ONTAP is running.
Controller attention        Attention         Amber             A problem has occurred in the controller. This problem has caused the system attention LED on the front of the chassis to be illuminated. Note: The LED might be illuminated on both controllers.
Controller attention        Attention         Off               The controller is functioning properly.

The Ethernet rows apply to dedicated Ethernet ports and to UTA2/CNA ports configured in Ethernet mode.
The Fibre Channel rows apply to UTA2/CNA ports configured in Fibre Channel mode.

Location and meaning of LEDs on the back of FAS8040, FAS8060, and FAS8080 controllers
You can check the LEDs on the back of the controller to learn the status of the controller's network or
disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the LEDs on the left side of the back of the FAS8040, FAS8060,
and FAS8080 controllers:

SAS port LEDs
SAS ports
10GbE port LEDs
10GbE ports
UTA2 (CNA) data network port LEDs
UTA2 (CNA) data network ports: 10GbE / 16Gb FC / 8Gb FC
NVRAM LED
Controller attention LED

The following table describes the meaning of the LEDs on the left side of the back of the controller:

Name                        Type              Status indicator  Description
Serial attached SCSI (SAS)  Link              Green             A link is established on at least one external SAS lane.
Serial attached SCSI (SAS)  Link              Off               No link is established on any external SAS lane.
Ethernet                    Link (Left)       Green             A link is established between the port and some upstream device.
Ethernet                    Link (Left)       Off               No link is established.
Ethernet                    Activity (Right)  Amber             Traffic is flowing over the connection.
Ethernet                    Activity (Right)  Off               No traffic is flowing over the connection.
Fibre Channel               Link              Green             A connection is established to the port.
Fibre Channel               Link              Off               No connection is established to the port.
NVRAM (labeled NV)          NVRAM status      Blinking green    NVRAM destage/restage events are occurring.
NVRAM (labeled NV)          NVRAM status      Solid green       NVRAM destage/restage events completed successfully.
NVRAM (labeled NV)          NVRAM status      Off (power on)    The system is running normally, and the NVRAM is armed if Data ONTAP is running.
Controller attention        Attention         Amber             A problem has occurred in the controller. This in turn has caused the system attention LED on the front of the chassis to be illuminated. Note: The LED might be illuminated on both controllers.
Controller attention        Attention         Off               The controller is functioning properly.

The Ethernet rows apply to dedicated Ethernet ports and to UTA2/CNA ports configured in Ethernet mode.
The Fibre Channel rows apply to UTA2/CNA ports configured in Fibre Channel mode.



The following illustration shows the location of ports and LEDs on the right side of the back of the
controller:

1000Base-T port LEDs
1000Base-T ports
Remote management port LEDs
Remote management 10/100/1000Base-T port
Private management port LEDs
Private management 10/100Base-T port
USB port (External USB devices not currently supported)
Serial port

The following table describes the meaning of the LEDs on the right side of the back of the controller:

LED name            LED type          Status indicator  Description
GbE                 Link (Left)       Green             A link is established between the port and some upstream device.
GbE                 Link (Left)       Off               No link is established.
GbE                 Activity (Right)  Amber             Traffic is flowing over the connection.
GbE                 Activity (Right)  Off               No traffic is flowing over the connection.
Remote management   Link (Left)       Green             A link is established between the port and some upstream device.
Remote management   Link (Left)       Off               No link is established.
Remote management   Activity (Right)  Amber             Traffic is flowing over the connection.
Remote management   Activity (Right)  Off               No traffic is flowing over the connection.
Private management  Link (Left)       Green             A link is established between the port and a downstream disk shelf.
Private management  Link (Left)       Off               No link is established.
Private management  Activity (Right)  Amber             Traffic is flowing over the connection.
Private management  Activity (Right)  Off               No traffic is flowing over the connection.

The label of each GbE LED includes the port number.

Location and meaning of LEDs on the back of FAS80xx I/O expansion modules
You can check the back of the I/O expansion module to detect whether a fault has occurred.
The following illustration shows the ports and LEDs on the back of a FAS80xx I/O expansion
module:


PCIe slots

Attention LED

HA interconnect ports

HA interconnect port activity LEDs

HA interconnect port link LEDs

The following table describes the meaning of the LEDs on the I/O expansion module:
Name                            Type         Status indicator  Description
I/O expansion module attention  Attention    Amber             A fault has occurred.
I/O expansion module attention  Attention    Off               The I/O expansion module is functioning normally.
HA interconnect port activity   Activity     Amber             Traffic is flowing over the connection.
HA interconnect port activity   Activity     Off               No traffic is flowing over the connection.
HA interconnect link            Link status  Green             A link is established with the partner node.
HA interconnect link            Link status  Off               No link is established.

Location and meaning of FAS8020 fan LEDs


You can check the LED on each fan module to pinpoint problems that can occur in the FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:
1

LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.

Location and meaning of FAS8040, FAS8060, and FAS8080 fan LEDs


You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.
When the bezel is removed, the fan module FRUs and their LEDs are visible. The following
illustration shows the LED on a fan module FRU:


LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.

Location and meaning of FAS80xx power supply LEDs


You can check the LEDs on each PSU to see whether its power is on and whether the PSU is
working properly.
The following illustration shows the location of PSU LEDs on the back of the system:


Attention LED

Power LED

The following table describes what the PSU LEDs mean:


Power LED status  Attention LED status  PSU condition
Green             Off                   The PSU is present and switched on; this is normal mode.
Off               Off                   The PSU is switched off.
Off               Blinking amber        This indicates a PSU fault; the AC in is out of range, or there is a fault in either the DC or fan.

Location and meaning of FAS8020 internal FRU LEDs


FAS8020 systems contain LEDs near the FRUs inside the controller. These LEDs can provide
helpful information when troubleshooting any problems with the FRUs.
The following FRU LEDs are found inside the controller:

DIMMs (4)
USB boot device
PCIe slots (2)


RTC battery
NVRAM battery

FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs.
They remain illuminated for at least 10 minutes even after you remove the controller from the
chassis.

Location and meaning of FAS8040, FAS8060, and FAS8080 internal FRU LEDs
FAS8040, FAS8060, and FAS8080 systems contain LEDs near the FRUs inside the controller. These
LEDs can provide helpful information when troubleshooting any problems with the FRUs.
The following FRU LEDs are found inside the controller:

DIMMs (5 for FAS8040, 9 for FAS8060 and FAS8080)


USB boot device
PCIe slots (4)
RTC battery
NVRAM battery

FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs.
They remain illuminated for at least 10 minutes even after you remove the controller from the
chassis.

NVRAM adapter LEDs


NVRAM adapter LEDs enable you to determine whether an NVRAM is holding unwritten data and,
in an HA pair, to check the connection between the two nodes.
The NVRAM preserves unwritten data if your system loses power. The NVRAM also is the HA
interconnect when your system is in an HA pair, except when you use MetroCluster.
Different systems have different kinds of NVRAM adapters. NVRAM6 and NVRAM8 adapters plug
into the motherboard. NVRAM7 and NVRAM9 adapters are integrated into the motherboard. The
following table shows the type of NVRAM that different systems support:
NVRAM type  Systems
NVRAM6      3040, 3070, and SA300; 60xx and SA600
NVRAM7      31xx
NVRAM8      62xx
NVRAM9      FAS80xx
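
For scripts that need to pick the right LED section for a given platform, the NVRAM table can be expressed as a simple mapping. The sketch below is illustrative only: the platform strings are copied from the table, NVRAM6 is assumed to cover the 3040, 3070, SA300, 60xx, and SA600 rows, and the dictionary and function names are hypothetical.

```python
# Hypothetical mapping of platform family to NVRAM type, transcribed from the
# table in this section. The NVRAM6 row assignment is an assumption.
NVRAM_BY_PLATFORM = {
    "3040": "NVRAM6",
    "3070": "NVRAM6",
    "SA300": "NVRAM6",
    "60xx": "NVRAM6",
    "SA600": "NVRAM6",
    "31xx": "NVRAM7",
    "62xx": "NVRAM8",
    "FAS80xx": "NVRAM9",
}

# Per the text: NVRAM7 and NVRAM9 are integrated into the motherboard;
# NVRAM6 and NVRAM8 are adapters that plug into the motherboard.
INTEGRATED_TYPES = {"NVRAM7", "NVRAM9"}


def nvram_info(platform: str) -> tuple:
    """Return (NVRAM type, how it attaches) for a platform family string."""
    nvram_type = NVRAM_BY_PLATFORM[platform]  # raises KeyError if unlisted
    if nvram_type in INTEGRATED_TYPES:
        return nvram_type, "integrated into the motherboard"
    return nvram_type, "plugs into the motherboard"


if __name__ == "__main__":
    print(nvram_info("62xx"))
```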


Location and meaning of NVRAM5 and NVRAM6 LEDs


You can check the LEDs to learn whether there is valid data in the NVRAM when your system loses
power. When you use the NVRAM adapter as an HA interconnect, you also can check the LEDs to
learn whether there is a connection between the nodes.
Two sets of LEDs by each port on the faceplate operate when you use the NVRAM5 or NVRAM6
adapter as an HA interconnect. NVRAM adapters also have an internal LED that you can see through
the faceplate. The following illustration shows LEDs on the NVRAM5 and NVRAM6 adapter:

L01 PH1

L02 PH2

NVRAM5

The following table explains what the LEDs on an NVRAM5 or NVRAM6 adapter mean:
LED type  Indicator  Status    Description
Internal  Red        Blinking  There is valid data in NVRAM. Note: The LED might blink red if your system did not shut down properly, as in the case of a power failure or panic. The data is replayed when the system boots again.
PH1       Green      On        The physical connection is working.
PH1       Green      Off       No physical connection exists.
LO1       Yellow     On        The logical connection is working.
LO1       Yellow     Off       No logical connection exists.


Location and meaning of NVRAM5 and NVRAM6 media converter LEDs


You can check the LED to learn whether the media converter has power, whether a link is present,
and whether the converter is operating normally.
The following illustration shows the location of the LED on NVRAM5 and NVRAM6 media
converters:
1

LED

Media converter

The following table explains what the LED on NVRAM5 and NVRAM6 media converters means:
Indicator    Status             Description
Green        On                 The media converter is operating normally.
Green/amber  On                 Power is present but link is down.
Green        Flickering or off  Power is present but link is down.

Location and meaning of NVRAM7 LEDs


You can check the LEDs to learn whether there is any unwritten data in the NVRAM when your controller
loses power.
Each 31xx controller has two NVRAM7 LEDs:

One LED is near the left front corner of the motherboard next to the NVRAM DIMM.
This LED is labeled D35 and NVRAM Data Valid When Lit. You can see the LED only after you
remove the controller from the chassis.


One LED is near the right rear corner of the motherboard. It is labeled D87.
You can see this LED through the rear grille of the controller as shown in the following
illustration:

NVRAM7 LED

Attention: NVRAM7 LEDs flash red if unwritten data is being held in the NVRAM when power
to the controller is turned off. If you remove the NVRAM7 battery or NVRAM7 DIMM when the
red LEDs are flashing, you lose data that is being held in the NVRAM.
Note: In an HA pair, each node continually monitors its partner and mirrors its partner's NVRAM
data. Therefore, if you remove a controller from a 31xx system in an HA pair without first shutting
it down, you can disregard the illuminated NVRAM LEDs on the motherboard of the removed
controller.

Location and meaning of NVRAM8 LEDs


You can check the LEDs on the NVRAM8 adapter to verify the connection between controllers in an
HA pair and to learn the status of data when the system loses power.
Five LEDs are on the faceplate and one LED on the adapter board is visible through the faceplate
grille.
The following illustration shows the LEDs on the NVRAM8 adapter:

[Illustration: NVRAM8 adapter faceplate; faceplate labels: LNK, ACT, INT LNK]

InfiniBand port 0 link LED

InfiniBand port 0 activity LED

InfiniBand port 0 connector

Internal link select LED

InfiniBand port 1 link LED

InfiniBand port 1 activity LED

InfiniBand port 1 connector

Port 0 link and activity LEDs are relevant when port 0 of the controller is connected to a partner in an
HA pair. The following table explains the meaning of the port 0 LEDs:

LED name         Status indicator  Description
Port 0 link      Green             A physical connection is working on the port 0 connector.
                 Off               A physical connection is not working on the port 0 connector.
Port 0 activity  Amber             A logical connection is working on the port 0 connector.
                 Off               A logical connection is not working on the port 0 connector.

Port 1 LEDs reflect the state of the port 1 connector used between two controllers installed in
different chassis or the state of the internal InfiniBand connection used between two controllers
installed in the same chassis. The following table explains the meaning of the port 1 LEDs:
LED name         Status indicator  Internal link select LED on (internal midplane connection)         Internal link select LED off (external cable connection)
Port 1 link      Green             An internal physical connection is working over the midplane.      An external physical connection is working on the port 1 connector.
                 Off               An internal physical connection is not working over the midplane.  An external physical connection is not working on the port 1 connector.
Port 1 activity  Amber             An internal logical connection is working over the midplane.       An external logical connection is working on the port 1 connector.
                 Off               An internal logical connection is not working over the midplane.   An external logical connection is not working on the port 1 connector.

Port 1 LEDs depend on the state of the Internal link select LED, which in HA pair configurations
depends on how the controllers are connected. The following table explains the meaning of the
internal link select LED:
LED name              Status indicator  Description
Internal link select  Green             The HA pair consists of two controllers in the same chassis connected over the internal midplane.
                      Off               The HA pair consists of two controllers in different chassis connected by an external cable.
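
The way the port 1 LEDs combine with the internal link select LED can be sketched in a few lines of Python. This is only an illustration of the tables in this section, not a NetApp tool: the function name describe_port1 and its boolean parameters are hypothetical, and each parameter simply records whether you see the corresponding LED lit.

```python
from typing import List


def describe_port1(int_lnk_on: bool, link_green: bool, activity_amber: bool) -> List[str]:
    """Interpret the port 1 LEDs given the internal link select LED state.

    int_lnk_on:     internal link select LED is lit (controllers share a chassis)
    link_green:     port 1 link LED is green
    activity_amber: port 1 activity LED is amber
    """
    # The internal link select LED decides which connection the port 1 LEDs describe.
    kind = "internal" if int_lnk_on else "external"
    where = "over the midplane" if int_lnk_on else "on the port 1 connector"
    physical = "working" if link_green else "not working"
    logical = "working" if activity_amber else "not working"
    return [
        f"{kind} physical connection is {physical} {where}",
        f"{kind} logical connection is {logical} {where}",
    ]


if __name__ == "__main__":
    for line in describe_port1(int_lnk_on=True, link_green=True, activity_amber=False):
        print(line)
```

Reading the internal link select LED first mirrors the tables: the same port 1 LED color describes either the midplane connection or the external cable, depending on that one LED.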

A destage status LED, located on the top of the adapter board, is visible through the grille of the
faceplate halfway between the top of the faceplate and the InfiniBand port 0 LEDs. The LED shows
the status of NVRAM8 data after an unexpected loss of system power.
Data might need to be destaged (saved from active DRAM to nonvolatile flash memory) after an
unexpected power loss. Destaging lasts about one minute. After data has been destaged, it must be
restaged (restored from nonvolatile flash memory to active DRAM) during system initialization.
The destage LED can light red or green. Its behavior depends on whether the system power is
on or off. When the system power is off, the LED behavior depends on whether the NVRAM8
adapter is running on battery power. The battery automatically turns off after data is destaged.
The following table explains the meaning of the destage status LED when the NVRAM8 adapter is in
the controller:
Destage LED status indicator  System power on                                                          System power off, battery power on     System power off, battery power off
Red                           The NVRAM8 adapter has destage data that needs to be restored.           Invalid                                Not applicable
Green                         The NVRAM8 adapter has restored data and is ready for the next destage.  Invalid                                Not applicable
Alternating red and green     Invalid                                                                  The NVRAM8 adapter is destaging data.  Not applicable
Off                           Invalid                                                                  Invalid                                The NVRAM8 adapter has finished destaging data.
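
Because the meaning of the destage status LED depends on both its color and the power state, the table can be read as a two-key lookup. The following Python sketch only transcribes the table above; the names Power, DESTAGE_LED, and destage_status are hypothetical, and every combination not listed is reported as invalid or not applicable.

```python
from enum import Enum


class Power(Enum):
    SYSTEM_ON = "system power on"
    BATTERY_ON = "system power off, battery power on"
    BATTERY_OFF = "system power off, battery power off"


# (LED state, power state) -> meaning, transcribed from the table above.
DESTAGE_LED = {
    ("red", Power.SYSTEM_ON): "adapter has destage data that needs to be restored",
    ("green", Power.SYSTEM_ON): "adapter has restored data and is ready for the next destage",
    ("alternating red and green", Power.BATTERY_ON): "adapter is destaging data",
    ("off", Power.BATTERY_OFF): "adapter has finished destaging data",
}


def destage_status(led: str, power: Power) -> str:
    """Return the table entry for an LED state, or a fallback for unlisted combinations."""
    return DESTAGE_LED.get((led.lower(), power), "invalid or not applicable")


if __name__ == "__main__":
    print(destage_status("Red", Power.SYSTEM_ON))
    print(destage_status("Off", Power.SYSTEM_ON))
```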

You can use the destage status LED when the adapter is removed from the system to determine
whether destage data is in the NVRAM8 adapter.
The following illustration shows the location of the destage status LED:

Destage status LED

InfiniBand port 0 LEDs

You activate the destage status LED when the NVRAM8 adapter is removed from the controller by
pressing and holding the button marked both SW6 and STATUS on the bottom of the adapter board.
The following illustration shows the location of the button:

Button for activating destage status LED

The LED can light up red or green and also show both colors simultaneously, creating a light that
appears amber. The following table explains the meaning of the destage status LED when the button
is pressed:
LED color   Description
None (Off)  No status; no battery power.
Amber       Miscellaneous status for debugging
Green       No data in flash memory; not destaged.
Red         Data in flash memory; destaged.

Location and meaning of NVRAM9 LEDs


You can check the LEDs to learn whether the NVRAM holds unwritten data after your controller
loses power.
Each FAS80xx controller has three NVRAM9 LEDs:

The external NVRAM9 status LED is found on the rear face of the chassis. This LED is labeled NV,
as shown in the following illustration:

The NVRAM9 DIMM attention LED is found behind the DIMM and near the edge of the
NVRAM9 adapter. This LED is labeled FRU LED5.
The internal NVRAM9 status LED is found on the corner of the NVRAM9 adapter. This LED is
labeled Destage LED3.
You can see LED3 and LED5 only after you remove the controller from the chassis, as shown in
the following illustration:

NVRAM9 status LED (external)

NVRAM9 DIMM attention LED (internal)

NVRAM9 status LED (internal)

The following table describes the meaning of the NVRAM9 LEDs:

NVRAM9 LED             Status indicator  Description
NVRAM9 status          Flashing green    NVRAM destage/restage events are occurring.
                       Solid green       NVRAM destage/restage events completed successfully.
                       Off (power on)    The system is running normally, and the NVRAM is armed if Data ONTAP is running.
NVRAM9 DIMM attention  Solid amber       An error has occurred on the NVRAM9 DIMM.

Attention: If you remove the NVRAM9 battery or the NVRAM9 DIMM when the green LEDs are
flashing, you lose data that is being held in the NVRAM. The NVRAM9 battery can be removed
after the destage is completed without loss of data.
Note: In an HA pair, each node continually monitors its partner and mirrors its partner's NVRAM
data. If you remove a controller from an FAS80xx system in an HA pair without first shutting it
down, you can disregard the illuminated NVRAM LEDs on the motherboard of the removed
controller.

Adapter card LEDs


Adapter card LEDs enable you to monitor your storage system and its components.
Your storage systems might have adapters installed and configured on them. These adapters include
LEDs, which show you whether the adapter has power, whether there is a network connection, and
whether data is being transmitted.

Converged network adapter (CNA) and unified target adapter (UTA/UTA2) LEDs

CNA and UTA/UTA2 adapters have LEDs that you can check to learn whether the adapter has
power, whether a link is established, or whether an error has occurred.
The terms converged network adapter and unified target adapter are synonyms. Depending on the
model and how it is configured, the adapter can provide Fibre Channel, iSCSI, or FCoE/Ethernet
capability.

Location and meaning of dual-port, 10-Gb, FCoE CNA HBA LEDs


You can check the LEDs on the HBA to learn about SAN or LAN traffic over the HBA and the
status of the HBA and the connection.
The following illustration shows the location of LEDs on a dual-port, 10-Gb, FCoE (Fibre Channel
over Ethernet) HBA:

One of two LAN LEDs

One of two SAN LEDs

Port a

Port b

One of two transmitter ports

One of two receiver ports

The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b.
Note: These HBAs are supported only in target mode and single system image controller failover
cfmode. You cannot use these HBAs as initiators to connect to disks or tape, and you cannot use them
for fabric MetroCluster interconnect configurations.
The following table explains what the LEDs on a dual-port, 10-Gb, FCoE HBA mean:

Port  SAN traffic green LED                      LAN traffic green LED                      Hardware state
a     Off                                        Off                                        Power is off
      Slow flashing (unison)                     Slow flashing (unison)                     Power is on/no link
      On                                         On                                         Power is on/link established, no activity
      On                                         Flashing                                   Power is on/link established, Rx/Tx Ethernet activity only
      Flashing                                   On                                         Power is on/link established, Rx/Tx storage activity only
      Flashing                                   Flashing                                   Power is on/link established, Rx/Tx Ethernet and storage activity
      Slow flashing, alternating with other LED  Slow flashing, alternating with other LED  Beaconing
b     Off                                        Off                                        Power is off
      Slow flashing (unison)                     Slow flashing (unison)                     Power is on/no link
      On                                         On                                         Power is on/link established, no activity
      On                                         Flashing                                   Power is on/link established, Rx/Tx Ethernet activity only
      Flashing                                   On                                         Power is on/link established, Rx/Tx storage activity only
      Flashing                                   Flashing                                   Power is on/link established, Rx/Tx Ethernet and storage activity
      Slow flashing, alternating with other LED  Slow flashing, alternating with other LED  Beaconing

Location and meaning of dual-port, 16-Gb FC, 10-GbE/FCoE UTA2 LEDs


You can check the LEDs on the UTA2 to learn about SAN or LAN traffic over the adapter and the
status of the adapter and its connections.

[Illustration: UTA2 faceplate, labeled PCIe x8 and 16G FC/10GbE, with ports marked PORT 1 and PORT 2]

Ports a and b, respectively

LEDs 0, 1, and 2, respectively; one for each port

The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b. Port a is shown
empty, as used with SFP+ copper cables. Port b is shown with an SFP+ optical module installed, as
used with LC fiber cables.
You must connect to the UTA2 ports using LC fiber-optic cables with supported SFP+ optical
modules, or by using supported copper SFP+ cables (in 10-GbE mode only).

Note: Both ports must operate in the same mode (FC or 10-GbE).

The LEDs on the UTA2 provide information about traffic to the ports as well as information about
their status and their connections. The LED color and activity vary depending on the mode in which
the ports are configured as well as the ports' current status.
         Fibre Channel mode                                  FCoE/Ethernet mode
LED ID   16-Gbps link     8-Gbps link      4-Gbps link      Power on,  Power on, 10-Gbps     Power on, link up,
         up/activity      up/activity      up/activity      no link    link up, no activity  Ethernet Tx/Rx activity
LED 0    Off              Off              Flashing amber   Green      Green                 Green
LED 1    Off              Flashing green   Off              Off        Green                 Flashing green
LED 2    Flashing amber   Off              Off              Off        Green                 Green

Note: By default, the UTA2 ships configured in Fibre Channel, target mode. To change the
personality and operational mode of the card, you must use the ucadmin command for systems
running Data ONTAP in 7-Mode, and the system node hardware unified-connect
command for systems running clustered Data ONTAP. See the man pages for details.
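
Because the same three LEDs mean different things in Fibre Channel mode and in FCoE/Ethernet mode, it can help to treat the table above as a lookup keyed by mode and by the pattern of LED 0, LED 1, and LED 2. The following Python sketch only transcribes that table; the names UTA2_PATTERNS and uta2_status and the mode keys "fc" and "ethernet" are hypothetical and do not correspond to any Data ONTAP command or output.

```python
from typing import Optional, Tuple

# LED patterns transcribed from the table above, as (LED 0, LED 1, LED 2).
UTA2_PATTERNS = {
    "fc": {
        ("off", "off", "flashing amber"): "16-Gbps link up/activity",
        ("off", "flashing green", "off"): "8-Gbps link up/activity",
        ("flashing amber", "off", "off"): "4-Gbps link up/activity",
    },
    "ethernet": {
        ("green", "off", "off"): "power on, no link",
        ("green", "green", "green"): "power on, 10-Gbps link up, no activity",
        ("green", "flashing green", "green"): "power on, link up, Ethernet Tx/Rx activity",
    },
}


def uta2_status(mode: str, leds: Tuple[str, str, str]) -> Optional[str]:
    """Return the table entry for an LED pattern, or None if the pattern is not listed."""
    key = tuple(led.lower() for led in leds)
    return UTA2_PATTERNS[mode].get(key)


if __name__ == "__main__":
    print(uta2_status("fc", ("Off", "Flashing green", "Off")))
    print(uta2_status("ethernet", ("Green", "Off", "Off")))
```

A pattern that the table does not list returns None rather than a guess, which keeps the sketch from implying states the guide does not document.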

Ethernet NIC LEDs


Ethernet NICs have LEDs that you can check to learn the status of the Ethernet connection and, in
some cases, network transfer speeds.
The NICs in your system may be fiber optic-based or copper-based. They may have one, two, or four
ports.

Location and meaning of single-port GbE NIC LEDs


You can check the LEDs on your single-port copper or fiber GbE NIC to learn whether there is a
network connection and whether there is data activity. On copper GbE NICs, you also can learn how
fast data is being transmitted.
The following illustration shows the location of LEDs on copper and fiber single-port GbE NICs:

Copper 10Base-T/100Base-TX/1000Base-T NIC

Fiber 1000Base-SX NIC

The following table explains what the LEDs on single-port copper GbE NICs mean:
LED type  Status indicator                  Description
ACT/LNK   Green                             Valid network connection established
          Blinking green or blinking amber  Data activity
          Off                               No network connection
10=OFF    Off                               Data transmits at 10 Mbps
100=GRN   Green                             Data transmits at 100 Mbps
1000=YLW  Yellow                            Data transmits at 1000 Mbps

The following table explains what the LEDs on single-port fiber GbE NICs mean:
LED type  Status indicator  Description
LNK       On                Valid network connection established
          Off               No network connection
ACT       On                Data activity
          Off               No network activity

Location and meaning of multiport GbE NIC LEDs


You can check the LEDs on your multiport copper or fiber GbE NIC to learn whether there is a
network connection and whether there is data activity. On copper GbE NICs, you also can learn how
fast data is being transmitted.
The following illustration shows the location of LEDs on copper and fiber dual-port GbE NICs:

Copper 10Base-T/100Base-TX/1000Base-T NIC

Fiber 1000Base-SX NIC

Network speed LEDs

The following illustration shows the location of LEDs on copper quad-port GbE NICs:

Note: The orientation of the ports on NICs might differ.

ACT LED

LNK LED

Port a

Port b

Port c

Port d

The following table explains what the LEDs on a copper multiport GbE NIC mean:

LED type  Status indicator                  Description
ACT       Green                             A valid network connection is established.
          Blinking green or blinking amber  There is data activity.
          Off                               There is no network connection.
LNK       Off                               Data transmits at 10 Mbps.
          Green                             Data transmits at 100 Mbps.
          Amber                             Data transmits at 1000 Mbps.

The following table explains what the LEDs on the fiber multiport GbE NICs might indicate:
LED type  Status indicator  Description
LNK       On                A valid network connection is established.
          Off               There is no network connection.
ACT       On                There is data activity.
          Off               There is no network activity present.

The following illustration shows the location of LEDs on optical quad-port GbE NICs:

[Illustration: optical quad-port GbE NIC; the status LED for each port is labeled GRN=1G]

Note: The ports might not be labeled. For convenience, they are identified in the following table as
ports a, b, c, and d:

Ports a, b, c, and d, respectively

Status LED; one for each port

The following table explains what the LEDs on optical multiport GbE NICs might indicate:
Status indicator  Description
Off               No link
Green             Adapter is connected to a valid link partner
Blinking green    Data activity

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports
fiber optic cables with SFP+ modules or copper SFP+ cables
You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and SFP+
optical modules or copper SFP+ cables to learn whether there is a network connection and whether
there is data activity.
The following illustration shows the location of LEDs and ports on the NIC:

LINK/ACT LED for port a

LINK/ACT LED for port b

Port a

Port b

SFP module latches

The following table explains what the LEDs on the NIC mean:
LED label  Status indicator  Description
LINK/ACT   Green             Valid network connection established
           Blinking amber    Data activity
           Off               No network connection

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports
fiber optic cables with X6569 SFP+ modules or copper SFP+ cables
You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and X6569
SFP+ optical modules or copper SFP+ cables to learn whether there is a network connection, whether
there is data activity, and whether the card is operating at 10-Gb speed.
The following illustration shows the location of LEDs and ports on the NIC:

[Illustration: dual-port 10-GbE NIC; the LEDs for each port are labeled GRN=10G and ACT/LNK]

Port a 10-Gb link LED

Port a ACT/Link LED

Port a with SFP+ installed

Port b with no SFP+ connector

Port b 10-Gb link LED

Port b ACT/Link LED

The following table explains what the LEDs on the card mean:
LED label  Status indicator  Description
GRN=10G    Green             NIC is operating at 10 Gbps
LINK/ACT   Green             Valid network connection established
           Blinking amber    Data activity
           Off               No network connection

Location and meaning of single-port, 10-GbE NIC LEDs (2050 systems only)
You can check the LEDs on your single-port, 10-GbE NIC to learn whether there is a network
connection and whether there is data activity. This NIC is used only in 2050 systems.
The following illustration shows the location of LEDs on the single-port, 10-GbE NIC:

LINK/ACT LED

Port a

The following table explains what the LEDs on the single-port, 10-Gb NIC mean:
LED label  Status indicator  Description
LINK/ACT   Green             Valid network connection established
           Blinking amber    Data activity
           Off               No network connection

Location and meaning of dual-port 10-GbE RJ45 NIC LEDs


You can check the LEDs on your dual-port, 10-GbE NIC that supports RJ45 cables to learn whether
there is a network connection, whether there is data activity, and the speed at which the NIC is
operating.
The following illustration shows the location of LEDs and ports on the NIC:

[Illustration: dual-port 10-GbE RJ45 NIC; the speed LED legend reads 10G=GRN, 1G=YLW, 100M=OFF, and the activity/link LED is labeled ACT/LNK]

Port a

Port b

Speed LED (one per port)

Activity/Link LED (one per port)

The following table explains what the LEDs on the card mean:
LED position and function  Status indicator  Description
Top (speed)                Green             NIC is operating at 10 Gbps
                           Amber             NIC is operating at 1 Gbps
                           Off               NIC is operating at 100 Mbps (not supported)
Bottom (activity/link)     Green             Valid network connection established
                           Flashing green    Data activity
                           Off               No network connection

The following table shows how the type and length of Ethernet cable used with the card determine
the speed at which it can perform.
Note: The color displayed by the card's speed LEDs is not affected by Ethernet cable type and
length.

Ethernet cable type  Maximum length for 10GBASE-T support  Maximum length for 1000BASE-T support
Category 6a          100 meters                            100 meters
Category 6           55 meters                             100 meters
Category 5e          Not supported                         100 meters
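
As a worked example of reading the cable table, the following Python sketch returns the fastest standard that a given cable run supports. It uses only the lengths from the table above; the names MAX_LENGTH_M and fastest_supported are hypothetical, and the sketch says nothing about what the NIC actually negotiates.

```python
from typing import Optional

# Maximum supported cable length in meters, transcribed from the table above.
# None means the standard is not supported on that cable type.
MAX_LENGTH_M = {
    "6a": {"10GBASE-T": 100, "1000BASE-T": 100},
    "6": {"10GBASE-T": 55, "1000BASE-T": 100},
    "5e": {"10GBASE-T": None, "1000BASE-T": 100},
}


def fastest_supported(category: str, length_m: float) -> Optional[str]:
    """Return the fastest standard the table allows for this cable, or None if neither fits."""
    limits = MAX_LENGTH_M[category]
    # Check the faster standard first so that it wins when both are within limits.
    for standard in ("10GBASE-T", "1000BASE-T"):
        limit = limits[standard]
        if limit is not None and length_m <= limit:
            return standard
    return None


if __name__ == "__main__":
    print(fastest_supported("6", 55))
    print(fastest_supported("6", 60))
    print(fastest_supported("5e", 10))
```

For example, a 60-meter Category 6 run exceeds the 55-meter limit for 10GBASE-T, so the sketch falls back to 1000BASE-T.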

Flash Cache module and PAM LEDs


Flash Cache modules and Performance Acceleration Modules (PAMs) have LEDs that you can check
to ensure that the card has power or to learn about its performance.
Flash Cache modules are available in capacities of 256 GB, 512 GB, 1 TB, and 2 TB. PAMs have a
capacity of 16 GB.
This document uses the term Flash Cache module to refer to caching modules with capacities greater
than 16 GB. Before the release of Data ONTAP 7.3.5, such adapters were called Performance
Acceleration Modules (PAM II). The name of the 16-GB caching module remains Performance
Acceleration Module (PAM I).

Location and meaning of PAM LEDs


The PAM has two LEDs, both visible through the perforations of the PCIe bracket. You can check
the LEDs to ensure that the module is in place and has power.
The position of the LEDs relative to the system depends on the model of the system it is installed in.
Different systems can have horizontal or vertical expansion slots.
The following table describes the behavior of the module LEDs.

LED            Description
Green          Power ready indicator: Replace the card if the LED is off
Blinking blue  Indicates the presence of the card; LED dims slightly on heavy loads
               Replace the card if it does not blink after you boot Data ONTAP.

Location and meaning of Flash Cache module LEDs


Each Flash Cache module has two LEDs, which you can check to see if the module is operating
properly and to view its performance.
The illustration shows the LEDs on a module.

Fault

Activity

The following table explains what the LEDs on the module mean:
LED type  Status indicator  Description
Fault     Solid amber       A fault has occurred.
Activity  Blinking green    There is activity on the card. The LED blinks once every two seconds when the card is idle, and the blink rate increases with the card's performance, up to 10 times per second.

HBA LEDs
HBAs have LEDs that you can check to learn whether the adapter has power, whether a link is
established, and whether an error has occurred.
Storage systems might have Fibre Channel or iSCSI host bus adapters installed and configured on
them.

Location and meaning of dual-port Fibre Channel HBA LEDs


You can check the LEDs on the HBA to learn the status of the Fibre Channel connection.
The following illustration shows the location of the LEDs on a dual-port Fibre Channel HBA:

Green LED

Amber LED

The following table explains what the LEDs on a dual-port Fibre Channel HBA mean:
Green     Amber     Description
On        On        Power is on
Off       Blinking  Sync is lost
Off       On        Signal acquired
On        Off       Ready
Blinking  Off       4 seconds solid followed by one flash: 1-Gb link speed
                    4 seconds solid green link followed by two flashes: 2-Gb link speed
Blinking  Blinking  Adapter firmware error detected

Location and meaning of dual-port, 4-Gb or 8-Gb, target-mode Fibre Channel HBA LEDs

You can check the LEDs to learn whether the HBA power is on, whether a firmware error has been
detected, and whether a link has been established.
The following illustration shows the location of LEDs on a dual-port, 4-Gb or 8-Gb, target-mode
Fibre Channel HBA:

Amber

Green

Yellow

Port a

Port b

Yellow

Green

Amber

TX

RX

TX

RX

The following table explains what the LEDs mean:


Yellow       Green        Amber        Description
Off          Off          Off          Power is off
On           On           On           Power is on (before firmware initialization)
Blinking     Blinking     Blinking     Power is on (after firmware initialization)
Blinking alternately (all three LEDs)  Adapter firmware error detected
Off          Off          On/blinking  4-Gb HBA: 1 Gbps link/I/O established
                                       8-Gb HBA: On for 2 Gbps link up; LED blinks several times per second during I/O activity
Off          On/blinking  Off          4-Gb HBA: 2 Gbps link/I/O established
                                       8-Gb HBA: On for 4 Gbps link up; LED blinks several times per second during I/O activity
On/blinking  Off          Off          4-Gb HBA: 4 Gbps link/I/O established
                                       8-Gb HBA: On for 8 Gbps link up; LED blinks several times per second during I/O activity
Blinking     Off          Blinking     Beacon
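
The same LED color indicates a different link speed on the 4-Gb HBA than on the 8-Gb HBA. The following Python sketch makes that dependency explicit for the link rows of the table above. The names SPEED_BY_LED and link_speed_gbps and the model keys "4-Gb" and "8-Gb" are hypothetical; the sketch returns None for every pattern that is not a single lit LED (power states, firmware error, beacon).

```python
from typing import Optional

# Link speed in Gbps indicated by the single lit (on or blinking) LED,
# per HBA type, transcribed from the link rows of the table above.
SPEED_BY_LED = {
    "amber": {"4-Gb": 1, "8-Gb": 2},
    "green": {"4-Gb": 2, "8-Gb": 4},
    "yellow": {"4-Gb": 4, "8-Gb": 8},
}


def link_speed_gbps(hba: str, yellow: str, green: str, amber: str) -> Optional[int]:
    """Return the link speed for a link-up pattern, or None for any other pattern."""
    states = {"yellow": yellow.lower(), "green": green.lower(), "amber": amber.lower()}
    lit = [color for color, state in states.items() if state in ("on", "blinking")]
    # A link-up row has exactly one LED on or blinking and the other two off.
    if len(lit) != 1:
        return None
    return SPEED_BY_LED[lit[0]][hba]


if __name__ == "__main__":
    print(link_speed_gbps("8-Gb", yellow="off", green="on", amber="off"))
    print(link_speed_gbps("4-Gb", yellow="off", green="on", amber="off"))
```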

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: four-LED version
You can check the LEDs on the HBA to learn the status of the storage system Fibre Channel link and
whether data is being transferred.
The following illustration shows the location of LEDs:

Port a (as identified by Data ONTAP)

Port b (as identified by Data ONTAP)

Port c (as identified by Data ONTAP)

Port d (as identified by Data ONTAP)

Port a LED

Port c LED

Port b LED

Port d LED

The following table describes what the LEDs mean:


LED label       Status indicator  Description
By port letter  White             Loss of sync or no link
                Blinking white    Adapter fault detected
                Amber             1 Gbps link established
                Blinking amber    1 Gbps data transfer activity
                Green             2 Gbps link established
                Blinking green    2 Gbps data transfer activity
                Blue              4 Gbps link established
                Blinking blue     4 Gbps data transfer activity

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: 12-LED version
You can check the LEDs on the HBA to learn the status of the Fibre Channel connection and whether
data is being transferred.
The following illustration shows the location of LEDs:

Port a (as identified in Data ONTAP)

Port b (as identified in Data ONTAP)

Port c (as identified in Data ONTAP)

Port d (as identified in Data ONTAP)

Ports a through d yellow LEDs

Ports a through d green LEDs

Ports a through d amber LEDs

The following table describes what the LEDs mean:


Yellow LEDs  Green LEDs   Amber LEDs   Description
Off          Off          Off          Power is off
On           On           On           Power is on (before firmware initialization)
Blinking     Blinking     Blinking     Power is on (after firmware initialization)
Blinking alternately (all three LEDs)  Firmware error detected
Off          Off          On           1 Gbps link established
Off          Off          Blinking     1 Gbps data transfer activity
Off          On           Off          2 Gbps link established
Off          Blinking     Off          2 Gbps data transfer activity
On           Off          Off          4 Gbps link established
Blinking     Off          Off          4 Gbps data transfer activity

Location and meaning of quad-port, 8-Gb, Fibre Channel HBA LEDs: 12-LED version
You can check the LEDs on the HBA to learn the status of the Fibre Channel connection and whether
data is being transferred.
The following illustration shows the location of LEDs:

Port a

Port b

Port c

Port d

TX (Transmit)

RX (Receive)

Yellow LED

Green LED

Amber LED

The following table describes what the LEDs mean:


Yellow LEDs  Green LEDs   Amber LEDs   Description
Off          Off          Off          The power is off
On           On           On           The power is on (before firmware initialization)
Blinking     Blinking     Blinking     The power is on (after firmware initialization)
Blinking alternately (all three LEDs)  Adapter firmware error detected
Off          Off          On/Blinking  Online, 2 Gbps link/I/O activity
Off          On/Blinking  Off          Online, 4 Gbps link/I/O activity
On/Blinking  Off          Off          Online, 8 Gbps link/I/O activity

Location and meaning of fiber-optic iSCSI target HBA LEDs


You can check the LEDs on the HBA to learn whether the HBA is on, whether it is connected to the
network, and whether there is data activity.
The following illustration shows the location of LEDs on a fiber optic, iSCSI, target HBA:

LINK LED

ACT LED

Port 2

Port 1

The following table explains what the LEDs on a fiber optic, iSCSI, target HBA mean:
LED label  Status indicator  Description
LINK       Yellow            HBA is on and connected to the network
           Off               HBA is not connected to the network
ACT        Green             Valid connection established
           Blinking green    Data activity

Location and meaning of copper iSCSI target HBA LEDs


You can check the HBA LEDs to learn whether the HBA is operating at 1 Gbps, whether a
connection is established, and whether there is data activity.
The following illustration shows the location of LEDs on a copper iSCSI target HBA:

Speed LED

ACT LED

Port 2

Port 1

The following table explains what the LEDs on a copper iSCSI target HBA mean:
LED label  Status indicator  Description
Speed      Green             HBA is operating at 1 Gbps
           Off               HBA is not operating at 1 Gbps
ACT        Amber             Valid connection established
           Blinking amber    Data activity

Location of dual-port, 3-Gb SAS HBA ports


Dual-port, 3-Gb SAS HBAs do not have LEDs that you can monitor.
The following illustration shows the location of ports on a dual-port 3-Gb SAS HBA and its cable:

Port a

Port b

QSFP-to-Mini-SAS copper cable: Mini-SAS connector (to card)

QSFP-to-Mini-SAS copper cable: QSFP connector (to disk shelf)

Location of quad-port, 3-Gb SAS HBA ports


Quad-port, 3-Gb SAS HBAs do not have LEDs that you can monitor.
The following illustration shows the location of ports on a quad-port, 3-Gb SAS HBA and its
cable:

Port a

Port b

Port c

Port d

SAS QSFP-to-QSFP copper cable

MetroCluster (FCVI) adapter LEDs


MetroCluster adapters have LEDs that you can check to learn whether the adapter has power and
whether an error has occurred. MetroCluster adapters are also known as Fibre Channel Virtual
Interface (FCVI) adapters.

Location and meaning of dual-port, 2-Gb MetroCluster adapter LEDs


You can check the LEDs on the adapter to learn whether the power is on, whether a signal has been
acquired, or whether an error has occurred.
The following illustration shows the location of LEDs on a dual-port 2-Gb MetroCluster adapter:

One of two amber LEDs

One of two green LEDs

Port a

Port b

One of two transmitter ports

One of two receiver ports

The following table explains what the LEDs mean:


Green                              Amber                              Description
Off                                Off                                Power is off
On                                 On                                 Power is on
Off                                Blinking at half-second intervals  Synchronization is lost
Off                                On                                 Signal is acquired
On                                 Off                                Adapter is online
Blinking at half-second intervals  Blinking at half-second intervals  System error has occurred

Location and meaning of dual-port, 4-Gb MetroCluster adapter LEDs


You can check the LEDs on the adapter to learn whether power is on, whether there is activity, or
whether an error has occurred.
The following illustration shows the LEDs on the dual-port, 4-Gb MetroCluster adapter:

Amber LED

Green LED

Yellow LED

Port a

Port b

Yellow LED

Green LED

Amber LED

Transmitter port

Receiver port

Transmitter port

Receiver port

The following table describes what the LEDs mean:


Yellow            Green             Amber             Description
Off               Off               Off               Power is off
On                On                On                Power is on (before firmware initialization)
Blinking          Blinking          Blinking          Power is on (after firmware initialization)
Yellow, green, and amber LEDs blinking alternately    Adapter firmware error detected
Off               Off               On/blinking       Online, 1 Gbps link/I/O activity
Off               On/blinking       Off               Online, 2 Gbps link/I/O activity
On/blinking       Off               Off               Online, 4 Gbps link/I/O activity

Location and meaning of dual-port, 8-Gb MetroCluster adapter LEDs


You can check the LEDs on the adapter to learn whether power is on, whether there is activity, or
whether an error has occurred.
The following illustration shows the LEDs on the dual-port, 8-Gb MetroCluster adapter:

Amber LED

Green LED

Yellow LED

Port a

Port b

Transmitter port

Receiver port

The following table describes what the LEDs mean:


Yellow            Green             Amber             Description
Off               Off               Off               Power is off
On                On                On                Power is on (before firmware initialization)
Blinking          Blinking          Blinking          Power is on (after firmware initialization)
Yellow, green, and amber LEDs blinking alternately    Adapter firmware error detected
Off               Off               On/blinking       Online, 2 Gbps link/I/O activity
Off               On/blinking       On                Online, 4 Gbps link/I/O activity
On/blinking       Off               Off               Online, 8 Gbps link/I/O activity

Location and meaning of dual-port, 16-Gb MetroCluster adapter LEDs


You can check the LEDs on the adapter to learn whether the power is on, what the connection speed
is, and whether there is data activity.

[Illustration: 16-Gb MetroCluster adapter faceplate, labeled PCIe x8, 16G FC/10GbE, and 16G FCVI, with ports marked PORT 1 and PORT 2]

Ports a and b, respectively

LEDs 0, 1, and 2, respectively; one for each port

The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b. Port a is shown
empty, as used with SFP+ copper cables. Port b is shown with an SFP+ optical module installed, as
used with LC fiber cables.
You must connect to the adapter ports using LC fiber-optic cables with supported SFP+ optical
modules, or by using supported copper SFP+ cables.

The LEDs on the adapter provide information about traffic to the ports as well as information about
their status and their connections.
LED legend

LED ID   16 Gbps link up/activity   8 Gbps link up/activity   4 Gbps link up/activity
LED 0    Off                        Off                       Flashing amber
LED 1    Off                        Flashing green            Off
LED 2    Flashing amber             Off                       Off
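As a quick illustration of how to read the legend, the following Python sketch maps a flashing LED to the link speed it indicates. The dictionary contents are taken from the legend above; the function name and data layout are hypothetical and are not part of any NetApp tool.

```python
# Link speed indicated by each flashing LED, per the LED legend above.
# Keys are (LED ID, flashing color); values are link speeds in Gbps.
LINK_SPEED_GBPS = {
    (0, "amber"): 4,
    (1, "green"): 8,
    (2, "amber"): 16,
}


def link_speed(led_id, color):
    """Return the link speed in Gbps, or None if the pair is not in the legend."""
    return LINK_SPEED_GBPS.get((led_id, color.lower()))


if __name__ == "__main__":
    print(link_speed(2, "amber"))  # LED 2 flashing amber
    print(link_speed(1, "Green"))  # LED 1 flashing green
```

A combination that the legend does not list, such as LED 0 flashing green, returns None instead of a guessed speed.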

TCP offload engine (TOE) NIC LEDs


TOE NICs have LEDs that you can check to learn the state of the network connection.
TOE NICs might have one port or multiple ports.

Location and meaning of single-port TOE NIC LEDs


The single-port TCP offload engine is a 10GBASE-SR fiber optic NIC. You can check the NIC
LEDs to learn whether it is on, whether there is a network connection, or whether the operating
system has booted.
The following illustration shows the location of LEDs on the NIC:

[Illustration: single-port TOE NIC. Callouts identify the fiber optic LC port, LINK LED, ACT LED, and STAT (power) LED.]

The following table explains what the LEDs mean:

LED type   Status indicator   Description
ACT/LNK    Green              Valid network connection established
           Blinking green     Data activity
           Off                No network connection
STAT       Red                NIC is on and receiving power
           Off                Operating system has booted

Location and meaning of quad-port TOE NIC LEDs


You can check the LEDs on the TOE NIC to learn whether there is data activity and to determine the
speed of data transmission.
The following illustration shows the location of LEDs on the TOE NIC:

[Illustration: quad-port TOE NIC. Callouts identify Port a, Port b, Port c, and Port d, and the activity LEDs: LED 1 corresponds to port a, LED 2 corresponds to port b, and so on.]


The following table explains what the LEDs on the TOE NIC mean:

LED label                Status indicator   Description
Labeled by port number   Yellow             Data transmits at 1 Gbps
                         Green              Data transmits at 10/100 Mbps
                         Blinking           Data activity

Location and meaning of dual-port, 10GBase-SR TOE NIC LEDs


You can check the LEDs on the TOE NIC to learn whether there is a network connection or data
activity.
The following illustration shows the location of LEDs on the TOE NIC:

[Illustration: dual-port, 10GBase-SR TOE NIC. Callouts identify the LINK/ACT LED for port a, the LINK/ACT LED for port b, fiber optic LC port a, and fiber optic LC port b.]

The following table explains what the LEDs on the TOE NIC mean:

LED label   Status indicator   Description
LINK/ACT    Green              Valid network connection established
            Green              Data activity
            Off                No network connection

Location and meaning of dual-port, 10GBase-CX4 TOE NIC LEDs


You can check the LEDs on the TOE NIC to learn whether there is a network connection or data
activity.
Note: The 10GBase-CX4 dual-port TOE NIC is for use only on systems running Data ONTAP
10.0.3 or later.
The following illustration shows the location of LEDs on the TOE NIC:

[Illustration: dual-port, 10GBase-CX4 TOE NIC. Callouts identify LINK/ACT LED a, Port a, LINK/ACT LED b, and Port b.]


The following table explains what the LEDs on the TOE NIC mean:

LED type   Status indicator   Description
LINK/ACT   Green              Valid network connection established
           Blinking green     Data activity
           Off                No network connection

Startup messages
When you apply power to your system, it verifies the hardware that is in the system, loads the
operating system, and displays startup informational and error messages on the system console.
There are two types of startup error messages:
• POST error messages
• Boot error messages

Both error message types are displayed on the system console, and an e-mail notification is sent out
by the remote management subsystem, if it is configured to do so.

POST messages
POST is a series of tests run from the motherboard PROM. These tests check the hardware on the
motherboard and differ depending on your system configuration.
POST messages appear on the system console before Data ONTAP software is loaded.
The following text is an example of a POST message on the console on a system that uses the
LOADER boot environment. Systems using the CFE boot environment display similar messages.
Phoenix TrustedCore(tm) Server
Copyright 1985-2005 Phoenix Technologies Ltd. All Rights Reserved
Portions Copyright (c) 2005-2009 NetApp All Rights Reserved
BIOS Version: 1.7X9
CPU= Dual Core AMD Opteron(tm) Processor 885 X 4
Testing RAM.
512MB RAM tested
32768MB RAM installed
Fixed Disk 0: NACF1GBJU-A11
Boot Loader version 1.6.1X2
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2009 NetApp
CPU Type: Dual Core AMD Opteron(tm) Processor 885
Starting AUTOBOOT press Ctrl-C to abort...
Note: If your system has an LCD, it displays POST messages without a header.


Boot messages
After the boot is successfully completed, your system loads the operating system. Messages provide
information about your system and alert you to errors that occur during boot.
Note: The exact boot messages that appear on your system console depend on your system
configuration.
The following message is an example of the start of a boot message that appears on the system
console of a FAS6030 storage system at first boot.
NetApp Release 7.3.1X19: Sat Nov 22 02:04:05 PST 2008
Copyright (C) 1992-2008 NetApp.
Starting boot on Wed Mar 25 00:51:31 GMT 2009
Wed Mar 25 00:52:13 GMT [diskown.isEnabled:info]: Software ownership has
been enabled ...
Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: Disk 0b17 is a
local HA mailbox disk
Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: Disk 0b16 is a
local HA mailbox disk
...
Wed Mar 25 00:51:17 GMT [cf.fm.partner:info]: Cluster monitor: partner
'node2'
...
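When you review captured console output, it can help to split each boot message into its timestamp, event name, severity, and text. The following Python sketch shows one way to do that for lines in the format of the example above. The regular expression and function name are hypothetical, and the sketch assumes that each message has been rejoined onto a single line (the example above is wrapped to fit the page).

```python
import re

# Pattern for one console message of the form:
#   Wed Mar 25 00:51:17 GMT [cf.fm.partner:info]: Cluster monitor: partner 'node2'
BOOT_MESSAGE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+) "
    r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<text>.*)$"
)


def parse_boot_message(line):
    """Split a boot message into its parts, or return None if it does not match."""
    match = BOOT_MESSAGE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: "
              "Disk 0b17 is a local HA mailbox disk")
    print(parse_boot_message(sample))
```

Lines that do not follow this format, such as the release banner or the "Starting boot on" line, return None and can be handled separately.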

FAS20xx and SA200 startup progress


FAS20xx and SA200 systems do not display POST error messages on the system console.
You can track BIOS and boot loader progress by watching a progress indicator on the system console
and by monitoring a sensor through the BMC.

Method of viewing progress on the console


You can view BIOS and boot loader progress by monitoring the progress indicator on your system
console.
The initial BIOS message appears on the console about five seconds after the system starts. After
that, and before the boot loader runs, continued POST progress is indicated by a line of dots (.) or
plus signs (+). These dots or plus signs follow the line showing the BIOS version, as shown in the
console output below:
AMI BIOS8 Modular BIOS
Copyright (C) 1985-2006, American Megatrends, Inc. All Rights Reserved
Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved
BIOS Version 3.0
...................
Boot Loader version 1.3
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (C) 2002-2005 Network Appliance Inc.
CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz
Starting AUTOBOOT press Ctrl-C to abort...

The dots or plus signs are a progress indicator to show that the BIOS is not hung. If the system
restarts after a fault, the dots are replaced by plus signs to indicate that the system NVMEM is armed,
or being protected, during the boot process.
The BIOS should begin loading Data ONTAP within about 25 seconds after the initial greeting.
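If you capture console output to a file, the progress indicator can be recognized programmatically. The following Python sketch classifies a single console line according to the description above: a line of dots means that POST is progressing, and a line of plus signs means that POST is progressing with NVMEM armed. The function name and return values are hypothetical.

```python
def classify_progress_line(line):
    """Classify a console line as a POST progress indicator.

    Returns "post-in-progress" for a line of dots,
    "post-in-progress-nvmem-armed" for a line of plus signs,
    and None for any other line (including empty or mixed lines).
    """
    stripped = line.strip()
    if not stripped:
        return None
    symbols = set(stripped)
    if symbols == {"."}:
        return "post-in-progress"
    if symbols == {"+"}:
        return "post-in-progress-nvmem-armed"
    return None


if __name__ == "__main__":
    for text in ("...................", "+++++++", "BIOS Version 3.0"):
        print(repr(text), classify_progress_line(text))
```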

Method of viewing progress through the BIOS Status sensor


The BMC monitors boot progress; you can determine the boot progress status through the BIOS
Status sensor by entering the sensors show BMC command.
The following text shows partial output of the BMC sensors show command:
bmc shell -> sensors show
name  State  ID  Reading  Crit-Low  Warn-Low  Warn-Hi  Crit-Hi
------------------------------------------------------------------------------------
1.1V         Normal  #77  1121 mV     95 mV  --  1239 mV
1.2V         Normal  #76  1239 mV     1038 mV  --  1357 mV
1.5V         Normal  #75  1522 mV     1309 mV  --  1699 mV
1.8V         Normal  #74  1829 mV     1569 mV  --  2029 mV
12.0V        Normal  #70  12080 mV    10160 mV  --  13840 mV
2.5V         Normal  #73  2520 mV     2116 mV  --  2870 mV
3.3V         Normal  #72  3374 mV     2808 mV  --  3799 mV
BIOS Status  Normal  #f0  Loader #20  -- --
Batt 8.0V    Normal  #50  7552 mV     --  8512 mV  8576 mV
Batt Amp     Normal  #59  0 mA        --  2112 mA  2208 mA

In the sensors show output, the BIOS Status sensor displays one of three states: Normal, Hung, or
Error. In the Reading column, the sensor displays BIOS and boot loader progress. In the example
output, the BIOS Status sensor displays a state of Normal and a reading of Loader #20, indicating
that the boot loader is running normally.
The following table lists the BIOS and boot loader progress values.
Status   Description
0x00     System software has cleanly shut down. (Sent only by Data ONTAP.)
0x01     Memory initialization is in progress.
0x02     NVMEM initialization is in progress (when NVMEM is armed).
0x05     User has entered setup.
0x13     Booting to Data ONTAP (or boot loader).
0x1F     BIOS is starting up. (Special message to the BMC.) This is the first BIOS status message.
         It might be quickly followed by another.
0x20     Boot loader is running.
0x21     Boot loader is programming the primary firmware hub. The BMC does not allow the
         system to be powered down at this time.
0x22     Boot loader is programming the alternate firmware hub. The BMC does not allow the
         system to be powered down at this time.
0x2F     Boot loader has transferred control to Data ONTAP. Data ONTAP might send this
         periodically to inform the BMC that Data ONTAP is running, if the BMC has rebooted.
0x60     BMC has shut power off.
0x61     BMC has turned power on.
0x62     BMC has reset the system.
0x63     BMC Watchdog power cycle.
0x64     BMC Watchdog cold reset.

The BIOS Status sensor also displays BIOS and boot loader error codes. If the BIOS Status sensor
displays a Hung or Error state, contact technical support for interpretation of the codes.
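The progress values listed above lend themselves to a simple lookup. The following Python sketch translates a Reading value such as Loader #20 into its description. It assumes that the digits after the number sign are the hexadecimal progress value, as the Loader #20 example suggests; the function and dictionary names are hypothetical and are not BMC commands.

```python
import re

# BIOS and boot loader progress values, as listed in the table above.
BIOS_STATUS = {
    0x00: "System software has cleanly shut down",
    0x01: "Memory initialization is in progress",
    0x02: "NVMEM initialization is in progress (NVMEM is armed)",
    0x05: "User has entered setup",
    0x13: "Booting to Data ONTAP (or boot loader)",
    0x1F: "BIOS is starting up",
    0x20: "Boot loader is running",
    0x21: "Boot loader is programming the primary firmware hub",
    0x22: "Boot loader is programming the alternate firmware hub",
    0x2F: "Boot loader has transferred control to Data ONTAP",
    0x60: "BMC has shut power off",
    0x61: "BMC has turned power on",
    0x62: "BMC has reset the system",
    0x63: "BMC Watchdog power cycle",
    0x64: "BMC Watchdog cold reset",
}


def describe_reading(reading):
    """Return the description for a reading such as 'Loader #20', or None.

    Assumption: the value after '#' is the hexadecimal progress value.
    """
    match = re.search(r"#([0-9A-Fa-f]{1,2})\s*$", reading)
    if match is None:
        return None
    return BIOS_STATUS.get(int(match.group(1), 16))


if __name__ == "__main__":
    print(describe_reading("Loader #20"))
```

Values that are not in the table return None; as noted above, error codes should be referred to technical support rather than interpreted locally.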

31xx, 60xx, SA300, and SA600 system POST error messages


POST error messages might appear on the system console if your system encounters errors while the
BIOS and boot loader initiate the hardware.

0200: Failure Fixed Disk


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0200: Failure Fixed Disk

Description

A disk error occurred.

Corrective action

Complete the following steps to see whether the CompactFlash card is bad:
1. Enter the following command at the boot environment prompt:
boot_diags



2. Select the cf-card test.
3. If the test shows that the CompactFlash card is bad, replace it.
If the CompactFlash card is good, replace the motherboard.

0230: System RAM Failed at offset:


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0230: System RAM Failed at offset

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot environment prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

0231: Shadow RAM failed at offset


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0231: Shadow RAM failed at offset

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot environment prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.


0232: Extended RAM failed at address line


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0232: Extended RAM failed at address line

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

0235: Multiple-bit ECC error occurred


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0235: Multiple-bit ECC error occurred

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

023C: Bad DIMM found in slot #


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

023C: Bad DIMM found in slot #

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

023E: Node Memory Interleaving disabled


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

023E: Node Memory Interleaving disabled

Description

A bad DIMM was detected, which causes BIOS to disable node interleaving.

Corrective
action

Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

0241: Agent Read Timeout


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0241: Agent Read Timeout

Description

Timeout occurs when BIOS tries to read or write information through System
Management Bus (SMBUS) or Inter-Integrated Circuit (I2C).

Corrective
action

Run the Agent diagnostic test:
1. Enter the following command at the boot loader prompt:
boot_diags

2. Select and run the following tests: agent, 2, and 6.
3. Select and run the following tests: mb, 2, and 8.

0242: Invalid FRU information


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0242: Invalid FRU information

Description

The information from the field-replaceable unit (FRU) Electrically Erasable
Programmable Read-Only Memory (EEPROM) is invalid.

Corrective
action

1. Enter the following command at the boot environment prompt:


boot_diags

2. To determine the FRU involved, select the following tests: mb and 74.
3. Check whether the FRU's model name, serial number, part number, and
revision are correct in one of the following ways:
• Visually inspect the FRU.
• Look for error messages indicating that the FRU information is invalid or
could not be read.

4. Contact technical support if you suspect a misprogrammed FRU.

0250: System battery is dead


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0250: System battery is dead - Replace and run SETUP

Description

The real-time clock (RTC) battery is dead.

Corrective action

1. Reboot the system.


2. If the problem persists, replace the RTC battery.
3. Reset the RTC.


0251: System CMOS checksum bad


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0251: System CMOS checksum bad - Default configuration used

Description

CMOS checksum is bad, possibly because the system was reset during BIOS
boot or because of a dead RTC battery.

Corrective action

1. Reboot the system.
2. If the problem persists, replace the RTC battery.
3. Reset the RTC.

0253: Clear CMOS jumper detected


This message occurs only on 60xx and SA600 systems.
Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0253: Clear CMOS jumper detected - Please remove for normal operation.

Description

The clear CMOS jumper is installed on the main board.

Corrective action Remove the clear CMOS jumper and reset the system.

0260: System timer error


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0260: System timer error

Description

The system clock is not ticking.

Corrective action

Replace the HT1000 chip.

0280: Previous boot incomplete


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

0280: Previous boot incomplete - Default configuration used

Description

The previous boot was incomplete, and the default configuration was used.



Corrective action Reboot the system.

02C2: No valid Boot Loader in System Flash - Non Fatal


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02C2: No valid Boot Loader in System Flash - Non Fatal

Description

No valid boot loader is found in system flash memory while the option to Halt
For Invalid Boot Loader is disabled in setup. As a result, the system still can
boot from CompactFlash if it has a valid boot loader.

Corrective
action

Enter the update_flash command two times to place a good boot loader in the
system flash.

02C3: No valid Boot Loader in System Flash - Fatal


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02C3: No valid Boot Loader in System Flash - Fatal

Description

No valid boot loader is found in system flash memory while the option to Halt
For Invalid Boot Loader is enabled in setup. As a result, the system halts. You
should take corrective action.

Corrective
action

Place a valid version of the boot loader in the system flash by completing either of
the following series of steps:
1. Boot from the backup boot image.
2. Enter the update_flash command.
or
1. Enter BIOS setup and disable boot from system flash.
2. Save the setting.
3. Reboot to the boot environment prompt, and then enter the update_flash
command two times.

02F9: FPGA jumper detected


This message occurs only on 60xx and SA600 systems.
Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02F9: FPGA jumper detected - Please remove for normal operation

Description

The Field Programmable Gate Array (FPGA) jumper was installed on the
motherboard.

Corrective action

1. Remove the FPGA jumper.
2. Reboot the system.

02FA: Watchdog Timer Reboot (PciInit)


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02FA: Watchdog Timer Reboot (PciInit)

Description

The watchdog times out while BIOS is doing PCI initialization.

Corrective
action

1. Power-cycle the system a few times or reset the system through the RLM.
2. If the problem persists, check the PCI interface by entering the following
command at the boot environment prompt:
boot_diags

3. Select and run the following tests: mb, 4, and 71.


4. Replace the motherboard if the diagnostics show a problem.

02FB: Watchdog Timer Reboot (MemTest)


This message appears only on 60xx and SA600 systems.
Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02FB: Watchdog Timer Reboot (MemTest)

Description

The watchdog times out while BIOS is testing the extended memory.

Corrective
action

1. Power-cycle the system a few times or reset the system through the RLM.
2. If the problem persists, check the memory interface by entering the following
command at the boot loader prompt:
boot_diags

3. Select and run the following tests: mem and 1.

4. Replace the DIMMs if the diagnostics show a problem.
5. Replace the motherboard if the problem persists.

02FC: LDTStop Reboot (HTLinkInit)


Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

02FC: LDTStop Reboot (HTLinkInit)

Description

The watchdog times out while BIOS is setting up the HT link speed.

Corrective
action

1. Power-cycle the system a few times or reset the system through the Remote LAN
Module (RLM).
2. If the problem persists, replace the motherboard.

No message on console
Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.

Message

No message on console. Problem might be reported in the Remote LAN Module (RLM)
system event log with the code 037h or in the SMBIOS system event log (SEL) with
the error code 237h.

Description

There is not enough memory to accommodate the SMBIOS structure.

Corrective
action

Either remove some adapters from PCI slots, or check the DIMMs and replace any
bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags

3. Select the following test: mem.


4. Replace the failed DIMMs.

FAS22xx, FAS25xx, 32xx, 62xx, FAS80xx, SA320, and SA620 system POST error messages
POST error messages might appear on the system console if your system encounters errors while the
BIOS and boot loader initiate the hardware.

0200: Failure Fixed Disk


Message

0200: Failure Fixed Disk

Description

A disk error occurred.

Corrective action

Replace the USB boot device.

SP error code

000h

0230: System RAM Failed at offset


Message

0230: System RAM Failed at offset:

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action

Check and replace the bad DIMM modules.

SP error code

030h

0231: Shadow RAM Failed at offset


Message

0231: Shadow RAM Failed at offset:

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action

Check and replace the bad DIMM modules.

SP error code

031h

0232: Extended RAM Failed at address line


Message

0232: Extended RAM Failed at address line:

Description

The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action

Check and replace the bad DIMM modules.

SP error code

032h


023A: ONTAP Detected Bad DIMM in slot


Message

023A: ONTAP Detected Bad DIMM in slot:

Description

Data ONTAP detected a bad DIMM and disabled it in the displayed DIMM
slot.

Corrective action Check and replace the bad DIMM modules.


SP error code

03Ah

023B: BIOS detected SPD checksum error in DIMM slot:


Message

023B: BIOS detected SPD checksum error in DIMM slot:

Description

The system BIOS detected an SPD (serial presence detect) checksum error in
the specified DIMM slot.

Corrective action Check and replace the bad DIMM modules.


SP error code

03Bh

023E: Node Memory Interleaving disabled


Message

023E: Node Memory Interleaving disabled

Description

Node memory interleaving is disabled due to defective DIMM modules.

Corrective action

Check and replace the bad DIMM modules.

SP error code

03Eh

0241: SMBus Read Timeout


Message

0241: SMBus Read Timeout

Description

A timeout occurs when the BIOS tries to read or write information through the
System Management Bus (SMBUS) or the Inter-Integrated Circuit (I2C).

Corrective action Run system-level diagnostics to check the SMBUS.


SP error code

041h

0242: Invalid FRU information


Message

0242: Invalid FRU information

Description

The information from the field-replaceable unit (FRU) Electrically Erasable
Programmable Read-Only Memory (EEPROM) is invalid.


Corrective action Program the FRU information through the SP or system-level diagnostics.
SP error code

042h

0250: System battery is dead - Replace and run SETUP


Message

0250: System battery is dead - Replace and run SETUP

Description

The real-time clock (RTC) battery is dead.

Corrective action

Replace the CMOS battery.

SP error code

050h

0251: System CMOS checksum bad


Message

0251: System CMOS checksum bad -- Default configuration used

Description

The CMOS checksum is bad, possibly because the system was reset during a
BIOS boot or because of a dead RTC battery.

Corrective action None. The BIOS corrects the error automatically, and the system continues its
normal boot.
SP error code

051h

0260: System timer error


Message

0260: System timer error

Description

The system clock is not ticking.

Corrective action

Replace the chipset.

SP error code

060h

0271: Check date and time settings


Message

0271: Check date and time settings

Description

The date or time setting is invalid.

Corrective action

1. Set the date and time in a proper range.


2. Make sure that the RTC battery is in and not dead.

SP error code

071h


0280: Previous boot incomplete - Default configuration used


Message

0280: Previous boot incomplete -- Default configuration used

Description

The previous boot attempt was incomplete, causing the system to boot with the
default BIOS configuration.

Corrective action Reboot the system.


SP error code

080h

02A1: SP Not Found


Message

02A1: SP Not Found

Description

SP does not respond or SP hangs.

Corrective action

Check and replace the SP.

SP error code

0A2h

02A2: System Error Log (SEL) Full


Message

02A2: System Error Log (SEL) Full

Description

SP system error log (SEL) is full.

Corrective action

Clear the SEL log for SP.

SP error code

0A2h

02A3: No Response From SP To FRU ID Read Request


Message

02A3: No Response From SP To FRU ID Read Request

Description

The Service Processor fails to respond to the FRU ID read request.

Corrective action

Check and replace the Service Processor.

SP error code

0A3h

02C2: No valid Boot Loader in System Flash - Non Fatal


Message

02C2: No valid Boot Loader in System Flash - Non Fatal

Description

No valid boot loader is found in the system flash memory while the option to
Halt For Invalid Boot Loader is disabled in setup. As a result, the system still
can boot from the boot media if it has a valid boot loader.


Corrective
action

Take one of the following actions:

• If the system can boot to the boot loader prompt through the boot media, run
the following command to place a good boot loader in system flash:
flash
• If the system cannot boot to the boot loader prompt through the boot media,
boot from the backup image through the SP, and then enter the following
command to place a good boot loader in the corrupted portion of system
flash:
flash

SP error code

0C2h

02C3: No valid Boot Loader in System Flash - Fatal


Message

02C3: No valid Boot Loader in System Flash - Fatal

Description

No valid boot loader is found in the system flash memory while the option to
Halt For Invalid Boot Loader is enabled in setup. As a result, the system
halts. You should take corrective action.

Corrective action Place a valid version of the boot loader in the system flash by completing the
following steps:
1. Boot the system from the backup boot image.
2. Enter the following command:
flash

SP error code

0C3h
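The numbered POST messages above share a common console format: a four-digit hexadecimal code, a colon, and the message text. The following Python sketch splits such a line into its parts and looks up the SP error code for a few of the messages listed above. The function and dictionary names are hypothetical, and the dictionary is deliberately partial; extend it from the entries in this section as needed.

```python
import re

# Console format of a numbered POST message, for example:
#   0230: System RAM Failed at offset:
POST_MESSAGE = re.compile(r"^(?P<code>[0-9A-F]{4}): (?P<text>.+)$")

# SP error codes for a few of the messages listed above (partial table).
SP_ERROR_CODES = {
    "0200": "000h",
    "0230": "030h",
    "0241": "041h",
    "02C3": "0C3h",
}


def parse_post_message(line):
    """Return the code, text, and SP error code (if known) for a POST message."""
    match = POST_MESSAGE.match(line.strip())
    if match is None:
        return None
    code = match.group("code")
    return {
        "code": code,
        "text": match.group("text"),
        "sp_error_code": SP_ERROR_CODES.get(code),
    }


if __name__ == "__main__":
    print(parse_post_message("0230: System RAM Failed at offset:"))
```

Messages without a numeric prefix, such as the "Fatal Error!" messages, do not match this pattern and return None.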

BIOS detected errors or invalid configuration in DIMM slot:


Message

BIOS detected errors or invalid configuration in DIMM slot:

Description

BIOS detected unknown errors in the displayed DIMM.

Corrective action Check and replace the bad DIMM modules.


SP error code

038h

BIOS detected pattern write/read mismatch in DIMM slot:


Message

BIOS detected pattern write/read mismatch in DIMM slot:

Description

The system BIOS detected a pattern write/read mismatch in the displayed
DIMM slot. Read/write mismatches indicate defective memory modules.


Corrective action Check and replace the bad DIMM modules identified.
SP error code

03Ch

BIOS detected uncorrectable ECC error in DIMM slot:


Message

BIOS detected uncorrectable ECC error in DIMM slot:

Description

The BIOS detected an uncorrectable ECC error in the displayed DIMM slot.

Corrective action Check and replace the bad DIMM modules.


SP error code

035h

BIOS detected unknown errors in DIMM slot:


Message

BIOS detected unknown errors in DIMM slot:

Description

The system BIOS detected unknown errors in the displayed DIMM. These
errors might indicate defective memory modules.

Corrective action Check and replace the bad DIMM modules.


SP error code

038h

Fatal Error! All DIMM failed and system can not continue boot!
Message

Fatal Error! All DIMM failed and system can not continue boot!

Description

All DIMMs are mapped out either as bad or having the disable flag set. The
system has no memory to continue.

Corrective action Complete the following steps:


1. Clear the CMOS.
2. Power-cycle the system.
3. If the problem persists, replace all DIMMs.
SP error code

N/A

Fatal Error! All channels are disabled!


Message

Fatal Error! All channels are disabled!

Description

All channels of DIMM are disabled.

Corrective action

Complete the following steps:
1. Clear the CMOS.
2. Power-cycle the system.
3. If the problem persists, replace all DIMMs.

SP error code

0EAh

Fatal Error! RDIMMs and UDIMMs are mixed!


Message

Fatal Error! RDIMMs and UDIMMs are mixed!

Description

The registered dual inline memory modules (RDIMMs) and unregistered dual
inline memory modules (UDIMMs) are mixed in the system.

Corrective action Make sure that the RDIMMs and UDIMMs are not mixed. For information
about the correct memory for your system, contact technical support.
SP error code

0EDh

Fatal Error! UDIMM in 3rd slot is not supported!


Message

Fatal Error! UDIMM in 3rd slot is not supported!

Description

An unregistered dual inline memory module (UDIMM) is populated in the third
slot.

Corrective action Make sure that an unregistered dual inline memory module (UDIMM) is not
plugged into the third slot.
SP error code

0EEh

Fatal Error: No DIMM detected and system can not continue boot!
Message

Fatal Error: No DIMM detected and system can not continue
boot!

Description

All DIMM serial presence detect (SPD) EEPROMs are inaccessible due to the
hanging of the Inter-Integrated Circuit (I2C) switch for System Management
Bus (SMBUS). The system regards the condition as if there were no DIMMs on
the system.

Corrective action Complete the following steps:


1. Power-cycle the system.
2. If the problem persists after power-cycling the system, replace the
motherboard.
SP error code

0E8h


No Response to Controller FRU ID Read Request via IPMI


Message

No Response to Controller FRU ID Read Request via IPMI

Description

The SP does not respond to a controller FRU information inquiry.

Corrective action

Check and replace the SP.

SP error code

0A4h

No Response to Midplane FRU ID Read Request via IPMI


Message

No Response to Midplane FRU ID Read Request via IPMI

Description

The SP does not respond to a midplane FRU information inquiry.

Corrective action

Check and replace the SP.

SP error code

0A5h

No message on the console


Message

No message on the console.

Description

There is not enough memory to accommodate the SMBIOS structure.

Corrective action

Check for and replace any bad DIMM modules.

SP error code

037h

SP FRU Entry is Blank or Checksum Error


Message

SP FRU Entry is Blank or Checksum Error

Description

FRU information is invalid.

Corrective action

Check and replace the FRU.

SP error code

0A3h

Software memory test failed!


Message

Software memory test failed!

Description

The software memory test failed in memory reference code (MRC) checking.

Corrective action Check and replace the bad DIMM modules.


SP error code

0EBh
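
The SP error codes listed with the POST messages above can be collected into a lookup table, for example when you scan captured console or SP logs. The following Python sketch is illustrative only: the `SP_ERROR_CODES` table and the `describe_sp_code` helper are hypothetical names, not part of any NetApp tool, and the table covers only the codes that appear with complete messages in this section.

```python
# Hypothetical lookup table built from the SP error codes listed in this section.
SP_ERROR_CODES = {
    0x035: "BIOS detected uncorrectable ECC error in DIMM slot:",
    0x038: "BIOS detected unknown errors in DIMM slot:",
    0x0EA: "Fatal Error! All channels are disabled!",
    0x0ED: "Fatal Error! RDIMMs and UDIMMs are mixed!",
    0x0EE: "Fatal Error! UDIMM in 3rd slot is not supported!",
    0x0E8: "Fatal Error: No DIMM detected and system can not continue boot!",
    0x0A4: "No Response to Controller FRU ID Read Request via IPMI",
    0x0A5: "No Response to Midplane FRU ID Read Request via IPMI",
    0x037: "No message on the console.",
    0x0A3: "SP FRU Entry is Blank or Checksum Error",
    0x0EB: "Software memory test failed!",
}


def describe_sp_code(code_text):
    """Return the startup message for a code written like '0EAh', or None."""
    text = code_text.strip().lower()
    if text.endswith("h"):
        text = text[:-1]  # drop the trailing hexadecimal marker
    try:
        return SP_ERROR_CODES.get(int(text, 16))
    except ValueError:
        return None  # for example "N/A"
```

A call such as `describe_sp_code("0EAh")` returns the matching message text; an unlisted code or `N/A` returns `None`.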


Boot error messages


Boot error messages might appear after the hardware passes all POSTs and your system encounters
errors while loading the operating system.

Boot device err


Message

Boot device err

Description

A CompactFlash card could not be found to boot from.

Corrective action

Insert a valid CompactFlash card.

Cannot initialize labels


Message

Cannot initialize labels

Description

When the system tries to create a new file system, it cannot initialize the disk
labels.

Corrective action Usually, you do not need to create and initialize a file system; do so only after
consulting technical support.

Cannot read labels


Message

Cannot read labels

Description

When your system tries to initialize a new file system, it has a problem reading
the disk labels it wrote to the disks.
This problem can occur because the system failed to read the disk size or
because the written disk labels were invalid.

Corrective action

Usually, you do not need to create and initialize a file system; do so only after
consulting technical support.

Configuration exceeds max PCI space


Message

Configuration exceeds max PCI space

Description

The memory space for mapping PCI adapters has been exhausted, for one of two
reasons:

There are too many PCI adapters in the system.
An adapter is demanding too many resources.

Corrective action

1. Verify that all expansion adapters in your system are supported.
2. Contact technical support for help. Have a list ready of all expansion
adapters installed in your system.

DIMM slot # has correctable ECC errors


Message

DIMM slot # has correctable ECC errors

Description

The specified DIMM slot has correctable error correction code (ECC) errors.

Corrective action Run diagnostics on your DIMMs. If the problem persists, replace the specified
DIMM.

Dirty shutdown in degraded mode


Message

Dirty shutdown in degraded mode

Description

The file system is inconsistent because you did not shut down the system
cleanly when it was in degraded mode.

Corrective action Contact technical support for instructions about repairing the file system.

Disk label processing failed


Message

Disk label processing failed

Description

Your system detects that the disk is not in the correct drive bay.

Corrective action

Make sure that the disk is in the correct bay.

Drive %s.%d not supported


Message

Drive %s.%d not supported

Description

%s: The disk number; %d: The disk ID number. The system detects an
unsupported disk drive.

Corrective action

1. Remove the drive immediately, or the system drops down to the
programmable ROM (PROM) monitor within 30 seconds.
2. Check the Hardware Universe at hwu.netapp.com to verify support for your
disk drive.

Error detection detected too many errors to analyze at once


Message

Error detection detected too many errors to analyze at once



Description

This message occurs when other error messages occur at the same time.

Corrective action See the other error messages and their respective corrective actions. If the
problem persists, contact technical support.

FC-AL loop down, adapter %d


Message

FC-AL loop down, adapter %d

Description

The system cannot detect the Fibre Channel-Arbitrated Loop (FC-AL) loop or
adapter.

Corrective action

1. Identify the adapter by entering the following command:


storage show adapter

2. Turn off the power on your system and verify that the adapter is properly
seated in the expansion slot.
3. Verify that all Fibre Channel cables are connected.

File system may be scrambled


Message

File system may be scrambled

Descriptions and corrective actions

The following table lists errors that cause the file system to become inconsistent
and steps you can take to correct the problem.

Description                               Corrective action

An unclean shutdown when your system      Contact technical support to learn how
is in degraded mode and when NVRAM is     to start the system from a system boot
not working.                              diskette and repair the file system.

The number of disks detected in the       Make sure that all disks on the system
disk array is different from the number   are properly installed in the disk
of disks recorded in the disk labels.     shelves.
The system cannot start when more than
one disk is missing.

The system encounters a read error        Contact technical support for help.
while reconstructing parity.

A disk failed at the same time the        Contact technical support to learn how
system crashed.                           to repair the file system.


Halted disk firmware too old


Message

Halted disk firmware too old

Description

The disk firmware is an old version.

Corrective action

Update the disk firmware by entering the following command:


disk_fw_update

Halted: Illegal configuration


Message

Halted: Illegal configuration

Description

The HA pair configuration is incorrect.

Corrective action

1. Check the console for details.


2. Verify that all cables are correctly connected.

Invalid PCI card slot %d


Message

Invalid PCI card slot %d

Description

%d: The expansion slot number. The system detects an adapter that is not
supported.

Corrective action Replace the unsupported adapter with an adapter that is included in the
Hardware Universe at hwu.netapp.com.
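
Several boot error messages above contain printf-style placeholders (`%s`, `%d`) that the system replaces with actual values. The following Python sketch shows one way to match a captured console line against such templates and recover the values. It is a sketch under stated assumptions: the helper names are hypothetical, `%s` is assumed to expand to a run of non-whitespace characters and `%d` to decimal digits, and the sample line `Drive 0a.16 not supported` is invented for illustration.

```python
import re

# Message templates copied from this section.
BOOT_TEMPLATES = [
    "Drive %s.%d not supported",
    "FC-AL loop down, adapter %d",
    "Invalid PCI card slot %d",
]

# Assumed expansions: %s is a non-whitespace run, %d is decimal digits.
_PLACEHOLDERS = {"%s": r"(\S+?)", "%d": r"(\d+)"}


def template_to_regex(template):
    """Turn a printf-style template into a compiled regular expression."""
    parts = re.split(r"(%[sd])", template)
    pattern = "".join(_PLACEHOLDERS.get(p, re.escape(p)) for p in parts)
    return re.compile(pattern)


def match_boot_message(line):
    """Return (template, captured values) for the first match, else None."""
    for template in BOOT_TEMPLATES:
        m = template_to_regex(template).fullmatch(line.strip())
        if m:
            return template, m.groups()
    return None


print(match_boot_message("Drive 0a.16 not supported"))
```

The captured values are returned as strings in placeholder order, so the caller decides how to interpret them (for example, converting the `%d` value to an integer slot or adapter number).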

No /etc/rc
Message

No /etc/rc

Description

The /etc/rc file is corrupted.

Corrective action

1. At the hostname> prompt, enter the following command:

setup

2. As the system prompts for system configuration information, use the
information you recorded in your system configuration information
worksheet in the Getting Started Guide.
For more information about your system setup program, see the appropriate
system administration guide.


No disk controllers
Message

No disk controllers

Description

The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disk
controllers.

Corrective action 1. Turn off your system power.


2. Verify that all NICs are properly seated in the appropriate expansion slots.

No disks
Message

No disks

Description

The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disks.

Corrective action Verify that all disks are properly seated in the drive bays.

No /etc/rc, running setup


Message

No /etc/rc, running setup

Description

The system cannot find the /etc/rc file and automatically starts setup.

Corrective action As the system prompts for system configuration information, use the
information you recorded in your system configuration information worksheet
in the Getting Started Guide.
For more information about your system setup program, see the appropriate
system administration guide.

No network interfaces
Message

No network interfaces

Description

The system cannot detect any network interfaces.

Corrective action 1. Turn off the system and verify that all network interface cards (NICs) are
seated properly in the appropriate expansion slots.
2. Run diagnostics to check the onboard Ethernet port.
3. If the problem persists, contact technical support.

No NVRAM present
Message

No NVRAM present



Description

The system cannot detect the NVRAM adapter.

Corrective action Make sure that the NVRAM adapter is securely installed in the appropriate
expansion slot.

NVRAM #n downrev
Message

NVRAM #n downrev

Description

n: The serial number of the nonvolatile RAM (NVRAM) adapter. The
NVRAM adapter is an early revision that cannot be used with the system.

Corrective action Check the console for information about which revision of the NVRAM adapter
is required. Replace the NVRAM adapter.

NVRAM: wrong pci slot


Message

NVRAM: wrong pci slot

Description

The system cannot detect the nonvolatile RAM (NVRAM) adapter.

Corrective action

For a stand-alone 3020 or FAS3050 system, make sure that the NVRAM
adapter is in slot 1.
For a 3020 or FAS3050 system in an HA pair, make sure that the NVRAM
adapter is in slot 2.
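
The slot rule above can be expressed as a small check, for example in an inventory script. This is a minimal sketch that encodes only the 3020 and FAS3050 rule stated in this entry; the function names are hypothetical.

```python
def expected_nvram_slot(in_ha_pair):
    """Expected NVRAM adapter slot on a 3020 or FAS3050 system, per the text above."""
    return 2 if in_ha_pair else 1


def nvram_slot_ok(actual_slot, in_ha_pair):
    """True if the adapter sits in the slot this entry calls for."""
    return actual_slot == expected_nvram_slot(in_ha_pair)
```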

Panic: DIMM slot #n has uncorrectable ECC errors


Message

Panic: DIMM slot #n has uncorrectable ECC errors. Replace these DIMMS.

Description

The specified DIMM has uncorrectable ECC errors.

Corrective action Replace the specified DIMM.

This platform is not supported on this release


Message

This platform is not supported on this release. Please consult the release notes.
Please downgrade to a supported release! Shutting down: EOL platform

Description

This platform is not supported on this release. Please consult the release notes
for your software.

Corrective action

You must downgrade your software version to a compatible release.
Verify that you have the correct URL for software download.


Too many errors in too short time


Message

Too many errors in too short time

Description

The error detection system is experiencing problems. This message occurs
when other error messages occur at the same time.

Corrective action See the other error messages and their respective corrective actions. If the
problem persists, call technical support.

Warning: Motherboard Revision not available


Message

Warning: Motherboard Revision not available. Motherboard is not
programmed.

Description

The system motherboard is not programmed with the correct revision.

Corrective action Replace the motherboard.

Warning: Motherboard Serial Number not available


Message

Warning: Motherboard Serial Number not available. Motherboard is not
programmed

Description

The system motherboard is not programmed with the correct serial number.

Corrective action Replace the motherboard.

Warning: system serial number is not available


Message

Warning: system serial number is not available. System backplane is not
programmed.

Description

The backplane of your system does not have the correct system serial number.

Corrective action Report the problem to technical support so that your system can be replaced.

Watchdog error
Message

Watchdog error

Description

An error occurred during the testing of the watchdog timer.

Corrective action

Replace the motherboard.

Watchdog failed
Message

Watchdog failed



Description

Your system watchdog reset hardware, used to reset your system from a system
hang condition, is not functioning properly.

Corrective action Replace the motherboard.


EMS and operational messages


You might encounter various messages on your system during normal operation.
The Event Management System (EMS) collects event data from various parts of the Data ONTAP
kernel and displays information about those events in AutoSupport messages. EMS messages appear
on your system console or LCD and provide information about disk drives, disk shelves, system
power supply, system fans, and acceleration modules.
Operational error messages might appear on your system console or LCD when the system is
operating, when it is halted, or when it is restarting because of system problems.
Additional information about messages that appear on your system console or in logs might be
available through the Syslog Translator tool on the NetApp Support Site at
support.netapp.com/eservice/ems.

Environmental EMS messages


EMS messages appear on the console and in AutoSupport messages if your system encounters
extremes in its operational environment. They also appear on the LCD display if your system has
one.
In 31xx systems, both controllers in a chassis share the power supplies. As a result, the system is
never shut down because of a single power supply failure. Removing one power supply does not shut
down the system.
Degraded power might be caused by bad power supplies, bad power, or bad components on the
motherboard. If spare power supplies are available, try replacing them to see whether that alleviates
the problem.

Chassis fan FRU failed


Message

Chassis fan FRU failed: current speed is 4272 RPM, on [time stamp].

LCD display

Fans stopped; replace them

LED behavior

FRU LED: Green if problem is PSU; off if problem is fan.

Description

This message occurs when a system fan fails.

Corrective action

Check LEDs on the fans and power supply.
If both fan LEDs are green, run diagnostics on the power supplies.
If the fan LED is off, replace the fan.

SNMP trap ID

#414: Chassis fan is degraded
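
When this message is collected from console or AutoSupport logs, the reported fan speed can be extracted for trending. The following is a minimal Python sketch; the function name is hypothetical, and the pattern assumes the message wording shown in this entry.

```python
import re

# Pattern based on the message text shown in this entry.
FAN_FAILED = re.compile(r"Chassis fan FRU failed: current speed is (\d+) RPM")


def fan_speed_rpm(line):
    """Return the reported fan speed in RPM, or None if the line does not match."""
    m = FAN_FAILED.search(line)
    return int(m.group(1)) if m else None
```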


Chassis over temperature on XXXX


LCD display

Temperature exceeds limits

Message

Chassis over temperature on XXXX at [time stamp].

Description

This message occurs when the system is operating above the high-temperature
threshold.

Corrective action 1. Make sure that the system has proper ventilation.
2. Power-cycle the system and run diagnostics on the system.
SNMP trap ID

#372: Chassis temperature is too hot

Chassis over temperature shutdown on XXXX


Message

Chassis over temperature shutdown on XXXX at [time stamp].

LCD display

Temperature exceeds limits

Description

This message occurs when the system is operating above the high-temperature
threshold. The system shuts down immediately.

Corrective action 1. Make sure that the system has proper ventilation.
2. Power-cycle the system and run diagnostics on the system.
SNMP trap ID

#371: Chassis temperature is too hot

Chassis Power Degraded: 3.3V in warn high state


Message

Chassis Power Degraded: 3.3V is in warn high state current voltage is 3273 mV
on XXXX at [time stamp].

LCD display

Power supply degraded

Description

This message occurs when the system is operating above the high-voltage
threshold.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#403: Chassis power is degraded
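
The voltage messages in this section carry three values: the nominal rail (for example, 3.3V), the warning state (high or low), and the measured voltage in millivolts. The following Python sketch extracts them from a log line. The `RailReading` type and `parse_rail_reading` function are hypothetical names, and the pattern assumes the wording shown in the message above.

```python
import re
from typing import NamedTuple, Optional


class RailReading(NamedTuple):
    rail_volts: float   # nominal rail named in the message, for example 3.3
    state: str          # "high" or "low"
    millivolts: int     # measured voltage reported in the message


_RAIL = re.compile(
    r"(\d+(?:\.\d+)?)V is in warn (high|low) state current voltage is (\d+) mV"
)


def parse_rail_reading(line) -> Optional[RailReading]:
    """Extract rail, warning state, and measured voltage, or return None."""
    m = _RAIL.search(line)
    if m is None:
        return None
    return RailReading(float(m.group(1)), m.group(2), int(m.group(3)))
```

The same pattern also matches the shutdown variant of the message, because it ignores the text before the rail name.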


Chassis power degraded: PS#


Message

Chassis Power degraded: PS#

LCD display

Power supply degraded

LED behavior

FRU LED: Amber

Description

This message occurs when there is a problem with one of the power supplies.

Corrective action 1. Check that the power supply is seated properly in its bay and that all power
cords are connected.
2. Power-cycle your system and run diagnostics on the identified power
supply.
3. If the problem persists, replace the identified power supply.
SNMP trap ID

#392: Chassis power supply is degraded

Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the
system is never shut down because of a single power supply failure. Removing one power supply
does not shut down the system.

Chassis Power Fail: PS#


Message

Chassis Power Fail: PS#

LCD display

Power supply degraded

Description

This message occurs when the power supply fails.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#6: Chassis power is degraded
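
The same corrective action recurs for most power supply messages in this section: insert a missing supply; otherwise power-cycle and run diagnostics; replace the supply only if the problem persists. As a minimal sketch, that decision can be written as follows. The function name and its two Boolean parameters are hypothetical.

```python
def power_supply_action(inserted, problem_persists=False):
    """Return the next corrective step for a power supply message."""
    if not inserted:
        return "Insert the power supply."
    if problem_persists:
        # Diagnostics were already run and the message still appears.
        return "Replace the identified power supply."
    return "Power-cycle the system and run diagnostics on the identified power supply."
```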

Chassis Power Shutdown


Message

Chassis Power Shutdown: Chassis Power Supply Fail: PS#

LCD display

Power supply degraded

LED behavior

FRU LED: Amber



Description

This message occurs when the system is in a warning state. The system shuts
down immediately.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#392: Chassis power supply is degraded

Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the
system is never shut down because of a single power supply failure. Removing one power supply
does not shut down the system.

Chassis power shutdown: 3.3V in warn low state


Message

Chassis power shutdown: 3.3V is in warn low state current voltage is 3273 mV
on XXXX at [time stamp].

LCD display

Power supply degraded

Description

This message occurs when the system is operating below the low-voltage
threshold. The system shuts down immediately.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#403: Chassis power is degraded

Chassis Power Supply: PS# removed


Message

Chassis Power Supply: PS# removed system will shutdown in 2 minutes

LCD display

Power supply degraded

LED behavior

FRU LED: Amber

Description

This message occurs when the power supply unit is removed from the system.
The system will shut down unless the power supply is replaced.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#501: Chassis power supply is degraded

Chassis power supply degraded: PS#


Note: This message appears only on 31xx systems.

Message

Chassis power supply degraded: PS#

LED behavior

FRU LED: Amber

Description

This message occurs when there is a problem with one of the power supplies.

Corrective action 1. Check that the power supply is seated properly in its bay and that all power
cords are connected.
2. Power-cycle your system and run diagnostics on the identified power
supply.
3. If the problem persists, replace the identified power supply.
SNMP trap ID

#392: Chassis power supply is degraded

Chassis power supply fail: PS#


LCD display

Power supply degraded

Message

Chassis power supply fail: PS#

Description

This message occurs when the system is operating below the low-voltage
threshold. The system shuts down immediately.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

N/A

Chassis power supply off: PS#


Note: This message appears only on 31xx systems.

Message

Chassis Power supply off: PS#



LED behavior

FRU LED: Off

Description

This message occurs when the power supply unit is turned off.

Corrective action Your action depends on whether the power supply is present:

If the power supply is present and is switched off, turn the switch on.
If the power supply is present and turned on, power-cycle your system and
run diagnostics on the identified power supply.
If the problem persists, replace the identified power supply.

SNMP trap ID

#395: Power supply not present

Chassis power supply off: PS#


Message

Chassis power supply off: PS#

LCD display

Power supply degraded

Description

This message occurs when one or more chassis power supplies are turned off.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#395: Power supply not present

Chassis power supply OK: PS#


Note: This message appears only on 31xx systems.

Message

Chassis power supply OK: PS#

LED behavior

FRU LED: Green

Description

This message occurs when the power supply is operating normally.

Corrective action

None.

SNMP trap ID

#397: Chassis power supply (%id) is OK

Chassis power supply removed: PS#


Note: This message appears only on 31xx systems.

Message

Chassis power supply removed: PS#

LED behavior

N/A



Description

This message occurs when the power supply unit is removed from the system.

Corrective action Your action depends on whether the power supply is present:

If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply.
If the problem persists, replace the identified power supply.

SNMP trap ID

#394: I/O expansion module is not present in the chassis

Chassis under temperature on XXXX


Message

Chassis under temperature on XXXX at [time stamp].

LCD display

Temperature exceeds limits

Description

This message occurs when the system is operating below the low-temperature
threshold.

Corrective action 1. Raise the ambient temperature around the system.


2. Power-cycle the system and run diagnostics on the system.
SNMP trap ID

#372: Chassis temperature is too cold

Chassis under temperature shutdown on XXXX


Message

Chassis under temperature shutdown on XXXX at [time stamp].

LCD display

Temperature exceeds limits

Description

This message occurs when the system is operating below the low-temperature
threshold. The system shuts down immediately.

Corrective action 1. Check that the system has proper ventilation. You might need to raise the
ambient temperature around the system.
2. Power-cycle the system and run diagnostics on the system.
SNMP trap ID

#371: Chassis temperature is too cold

Fan: # is spinning below tolerable speed


Message

Fan: # is spinning below tolerable speed replace immediately to avoid
overheating

LCD display

Fans stopped; replace them

Description

This message occurs when one or more chassis fans are spinning too slowly.



Corrective action Check LEDs on the fans.

If both fan LEDs are green, run diagnostics on the motherboard.
If the fan LED is off, replace the fan.

SNMP trap ID

#415: Chassis fan is degraded

monitor.chassisFan.degraded
Message

monitor.chassisFan.degraded

Severity

ALERT

Description

This message is issued when a chassis fan is degraded.

Corrective action

The fan unit should be replaced.

SNMP trap ID

#412 Chassis fan is degraded: %s

monitor.chassisFan.ok
Message

monitor.chassisFan.ok

Severity

NOTICE

Description

This message occurs when the chassis fans are OK.

Corrective action

N/A

SNMP trap ID

#366 Chassis FRU is OK

monitor.chassisFan.removed
Message

monitor.chassisFan.removed

Severity

ALERT

Description

This message occurs when a chassis fan is removed.

Corrective action

Replace the fan unit.

SNMP trap ID

#363 Chassis FRU is removed

monitor.chassisFan.slow
Message

monitor.chassisFan.slow

Severity

ALERT

Description

This message occurs when a chassis fan is spinning too slowly.

Corrective action

Replace the fan unit.



SNMP trap ID

#365 Chassis FRU contains at least one fan spinning slowly

monitor.chassisFan.stop
Message

monitor.chassisFan.stop

Severity

ALERT

Description

This message occurs when a chassis fan is stopped.

Corrective action

Replace the fan unit.

SNMP trap ID

#364 Chassis FRU contains at least one stopped fan

monitor.chassisFan.warning
Message

monitor.chassisFan.warning

Severity

ALERT

Description

This message is issued when a chassis fan is spinning either too slowly or too
fast. This is a warning message.

Corrective action The fan unit should be replaced.


SNMP trap ID

#415 Chassis fan is in warning state

monitor.chassisFanFail.xMinShutdown
Message

monitor.chassisFanFail.xMinShutdown

Severity

EMERG

Description

This message indicates that multiple chassis fans have failed and the system
will shut down in a few minutes unless the problem is corrected.

Corrective action Make sure the system fans are working.


SNMP trap ID

#511 Multiple Chassis Fan failure: System will shut down in 2 minutes.
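
The chassis fan events above can be tabulated by severity and SNMP trap ID, which helps when you decide which events should page an operator. The following Python sketch copies the event names, severities, and trap IDs from this section. The severity ranking is an assumption based on conventional syslog ordering (EMERG most severe), not something this guide states, and the helper name is hypothetical.

```python
# Event name -> (severity, SNMP trap ID), copied from the entries above.
FAN_EVENTS = {
    "monitor.chassisFan.degraded": ("ALERT", 412),
    "monitor.chassisFan.ok": ("NOTICE", 366),
    "monitor.chassisFan.removed": ("ALERT", 363),
    "monitor.chassisFan.slow": ("ALERT", 365),
    "monitor.chassisFan.stop": ("ALERT", 364),
    "monitor.chassisFan.warning": ("ALERT", 415),
    "monitor.chassisFanFail.xMinShutdown": ("EMERG", 511),
}

# Assumed ranking (syslog convention): a lower number is more severe.
SEVERITY_RANK = {
    "EMERG": 0, "ALERT": 1, "CRIT": 2, "ERR": 3,
    "WARNING": 4, "NOTICE": 5, "INFO": 6, "DEBUG": 7,
}


def events_at_least(severity):
    """Return the fan events whose severity is at or above the given level."""
    limit = SEVERITY_RANK[severity]
    return sorted(
        name for name, (sev, _trap) in FAN_EVENTS.items()
        if SEVERITY_RANK[sev] <= limit
    )
```

For example, `events_at_least("ALERT")` returns every fan event except `monitor.chassisFan.ok`.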

monitor.chassisPower.degraded
Message

monitor.chassisPower.degraded

Severity

NOTICE

Description

This message indicates that a power supply is degraded.

Corrective action

1. If spare power supplies are available, try replacing them to see whether
that alleviates the problem.



2. Otherwise, contact technical support for further instruction.
SNMP trap ID

#403 Chassis power is degraded

monitor.chassisPower.ok
Message

monitor.chassisPower.ok

Severity

NOTICE

Description

This message indicates that the motherboard power is OK.

Corrective action

N/A

SNMP trap ID

#406 Normal operation

monitor.chassisPowerSupplies.ok
Message

monitor.chassisPowerSupplies.ok

Severity

INFO

Description

This message indicates that all power supplies are OK.

Corrective action

N/A

SNMP trap ID

#396 Normal operation

monitor.chassisPowerSupply.degraded
Message

monitor.chassisPowerSupply.degraded

Severity

INFO

Description

This message indicates that a power supply is degraded.

Corrective action

A replacement power supply might be required. Contact technical support for
further instruction.

SNMP trap ID

#392 Chassis power supply is degraded

monitor.chassisPowerSupply.notPresent
Message

monitor.chassisPowerSupply.notPresent

Severity

NOTICE

Description

This message indicates that a power supply is not present.

Corrective action

Replace the power supply.

SNMP trap ID

#394 Power supply not present


monitor.chassisPowerSupply.off
Message

monitor.chassisPowerSupply.off

Severity

NOTICE

Description

This message indicates that a power supply is turned off.

Corrective action

Turn on the power supply.

SNMP trap ID

#395 Power supply not present

monitor.chassisPowerSupply.ok
Message

monitor.chassisPowerSupply.ok

Severity

INFO

Description

This message indicates that the power supply is OK.

Corrective action

None.

SNMP trap ID

#397 Chassis power supply (%id) is OK

monitor.chassisTemperature.cool
Message

monitor.chassisTemperature.cool

Severity

ALERT

Description

This message occurs when the chassis temperature is too cool.

Corrective action

Raise the temperature around the system.

SNMP trap ID

#372 Chassis temperature is too cool

monitor.chassisTemperature.ok
Message

monitor.chassisTemperature.ok

Severity

NOTICE

Description

This message occurs when the chassis temperature is normal.

Corrective action

N/A

SNMP trap ID

#376 Normal operation

monitor.chassisTemperature.warm
Message

monitor.chassisTemperature.warm



Severity

ALERT

Description

This message occurs when the chassis temperature is too warm.

Corrective action

Check to see whether air conditioning units are needed, or whether they are
functioning properly.

SNMP trap ID

#372 Chassis temperature is too warm
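
Note that the too-cool and too-warm chassis temperature events above share SNMP trap ID #372, so a trap receiver cannot tell them apart by ID alone and has to look at the event name or trap text. The following Python sketch, using a hypothetical helper name, makes that ambiguity explicit.

```python
# Event name -> SNMP trap ID, copied from the three entries above.
TEMPERATURE_EVENTS = {
    "monitor.chassisTemperature.cool": 372,
    "monitor.chassisTemperature.ok": 376,
    "monitor.chassisTemperature.warm": 372,
}


def events_for_trap(trap_id):
    """Return every chassis temperature event that maps to the given trap ID."""
    return sorted(
        name for name, trap in TEMPERATURE_EVENTS.items() if trap == trap_id
    )


print(events_for_trap(372))  # two candidates: the cool and warm events
```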

monitor.cpuFan.degraded
Message

monitor.cpuFan.degraded

Severity

NOTICE

Description

This message indicates that a CPU fan is degraded.

Corrective action

1. Replace the identified fan.


2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID

#383 A CPU fan is not operating properly

monitor.cpuFan.failed
Message

monitor.cpuFan.failed

Severity

NOTICE

Description

This message indicates that a CPU fan has failed.

Corrective action

1. Replace the identified fan.


2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID

#381: CPU fan is stopped

monitor.cpuFan.ok
Message

monitor.cpuFan.ok

Severity

INFO

Description

This message indicates that a CPU fan is OK.

Corrective action

N/A

SNMP trap ID

#386 Normal operation


monitor.ioexpansion.unpresent
Message

monitor.ioexpansion.unpresent

Severity

NOTICE

Description

This message occurs when the I/O expansion module is not inserted into the
chassis.

Corrective action None.


SNMP trap ID

#394: I/O expansion module is not present in the chassis.

monitor.ioexpansionPower.degraded
Message

monitor.ioexpansionPower.degraded

Severity

NOTICE

Description

This message indicates that power on the I/O expansion module is degraded.

Corrective action Degraded power might be caused by bad power supplies, bad wall power, or
bad components on the motherboard. If spare power supplies are available, try
exchanging them to see whether the problem is resolved. Otherwise, contact
technical support.
SNMP trap ID

#403 Power on IO expansion is degraded:

monitor.ioexpansionPower.ok
Message

monitor.ioexpansionPower.ok

Severity

NOTICE

Description

This message indicates that power on the I/O expansion module is OK.

Corrective action

None.

SNMP trap ID

#406 Power on IO expansion module is OK

monitor.ioexpansionTemperature.cool
Message

monitor.ioexpansionTemperature.cool

Severity

ALERT

Description

This warning message occurs when the I/O expansion module is too cold.

Corrective action The system cannot function in an environment that is too cold; find ways to
warm the system.



SNMP trap ID

#372 I/O expansion module is too cold:

monitor.ioexpansionTemperature.ok
Message

monitor.ioexpansionTemperature.ok

Severity

NOTICE

Description

This message occurs when the temperature of the I/O expansion module is
normal. It is logged in two cases: as LOG_NOTICE, to show that a bad
condition has reverted to normal, and as LOG_INFO, hourly, to indicate that
the temperature is OK.

Corrective action None.


SNMP trap ID

#376 Temperature of the I/O expansion module is OK.
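
Because the same event name is logged at two levels, a log consumer can use the level to distinguish a recovery from a routine hourly report. The following is a minimal Python sketch of that distinction; the function name and the returned strings are hypothetical.

```python
def explain_io_temperature_ok(log_level):
    """Interpret monitor.ioexpansionTemperature.ok according to its log level."""
    reasons = {
        "LOG_NOTICE": "A bad temperature condition has reverted to normal.",
        "LOG_INFO": "Hourly report: the temperature is OK.",
    }
    return reasons.get(log_level, "Unexpected log level for this event.")
```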

monitor.ioexpansionTemperature.warm
Message

monitor.ioexpansionTemperature.warm

Severity

ALERT

Description

This warning message occurs when the I/O expansion module is too warm.

Corrective action Evaluate the environment in which the system is functioning: Are air
conditioning units needed or is the current air conditioning not functioning
properly?
SNMP trap ID

#372: I/O expansion module is too warm

monitor.nvmembattery.warninglow
Message

monitor.nvmembattery.warninglow

Severity

WARNING

Description

This message occurs when the NVMEM (nonvolatile memory) lithium battery
is low on power.

Corrective action Replace the NVMEM battery as soon as practical.


SNMP trap ID

#63 NVMEM battery is low on power and should be replaced as soon as
practical.

monitor.nvramLowBattery
Message

monitor.nvramLowBattery

Severity

NODE_ERROR

Description

This message occurs when the NVRAM batteries are discovered to be at a
dangerously low power level.

Corrective action Contact technical support.


SNMP trap ID

N/A

monitor.power.unreadable
Message

monitor.power.unreadable

Severity

INFO

Description

This message occurs when a power sensor in the controller module is not
readable.

Corrective action Shut down the system and power-cycle the controller module. If the sensor is
still not readable, replace the controller module.
SNMP trap ID

N/A

monitor.shutdown.cancel
Message

monitor.shutdown.cancel

Severity

WARNING

Description

This message is issued when an automatic shutdown sequence has been
canceled.

Corrective action None.


SNMP trap ID

#6 Automatic shutdown sequence canceled

monitor.shutdown.cancel.nvramLowBattery
Message

monitor.shutdown.cancel.nvramLowBattery

Severity

WARNING

Description

This message is issued when an automatic shutdown sequence has been
postponed due to RAID reconstruction.

Corrective action Unknown


SNMP trap ID

#6 NVRAM battery is dangerously Low. Halt delayed until %s finishes.

monitor.shutdown.chassisOverTemp
Message

monitor.shutdown.chassisOverTemp

Severity

CRIT

Description

This message occurs just before shutdown, indicating that the chassis
temperature is too hot.

Corrective action Check to see if air conditioning units are needed, or whether they are
functioning properly.
SNMP trap ID

#371 Chassis temperature is too hot

monitor.shutdown.chassisUnderTemp
Message

monitor.shutdown.chassisUnderTemp

Severity

CRIT

Description

This message occurs just before shutdown, indicating that the chassis
temperature is too cold.

Corrective action Raise the temperature around the system.


SNMP trap ID

#371 Chassis temperature is too cold

monitor.shutdown.emergency
Message

monitor.shutdown.emergency

Severity

NODE_FAULT

Description

This message is issued when an emergency shutdown is initiated.

Corrective action

None.

SNMP trap ID

#6 Emergency shutdown: %s

monitor.shutdown.ioexpansionOverTemp
Message

monitor.shutdown.ioexpansionOverTemp

Severity

CRIT

Description

This message occurs when the I/O expansion module is too hot. This message
is sent just before shutdown.

Corrective action The system environment is too hot; cool the environment.
SNMP trap ID

#371: I/O expansion module is too hot

monitor.shutdown.nvramLowBattery.pending
Message

monitor.shutdown.nvramLowBattery.pending

Severity

WARNING

Description

This message is issued when an automatic shutdown sequence is pending due to
a low battery.

Corrective action Replace the battery.


SNMP trap ID

#62 Emergency shutdown: NVRAM battery dangerously low in degraded
mode. Replace the battery immediately!

monitor.temp.unreadable
Message

monitor.temp.unreadable

Severity

INFO

Description

This message occurs when the controller module temperature is not readable.
While the temperature cannot be read, the system does not automatically shut
down if it becomes too hot for reliable operation.

Corrective action Shut down the system and power-cycle the controller module. If the
temperature is still not readable, replace the controller module.
SNMP trap ID

N/A

Multiple chassis fans have failed


Message

Multiple chassis fans have failed; system will shut down in 2 minutes.

LCD display

Fans stopped; replace them.

Description

This message occurs during a multiple chassis fan failure. The system shuts
down in two minutes if this condition is uncorrected.

Corrective action 1. Replace both fans.
2. Power-cycle and run diagnostics on the system.
SNMP trap ID

#511: Chassis fan is degraded

Multiple fan failure on XXXX


Message

Multiple fan failure on XXXX at [time stamp].

LCD display

Fans stopped; replace them.

LED behavior

FRU LED: Amber

Description

This message occurs when both system fans fail. The system shuts down
immediately.

Corrective action 1. Replace both fans.
2. Power-cycle and run diagnostics on the system.
SNMP trap ID

#6 Emergency shutdown

Multiple power supply fans failed


Message

Multiple power supply fans failed; system will shut down in 2 minutes.

LCD display

Power supply degraded

Description

This message occurs when multiple power supplies and fans have failed. The
system shuts down in two minutes if this condition is uncorrected.

Corrective action Your action depends on whether the power supply is present.

If the power supply is not inserted, insert it.

If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.

SNMP trap ID

#521: Chassis power is degraded

nvmem.battery.capacity.low
Message

nvmem.battery.capacity.low

Severity

NODE_ERROR

Description

This message occurs when the NVMEM battery lacks the capacity to preserve
the NVMEM contents for the required minimum of 72 hours. The system is at
risk of data loss if the power fails. This message repeats every hour while the
problem continues, and the system shuts down in 24 hours if automatic
recharging of the battery does not restore its charge.

Corrective action Correct any environmental problems, such as chassis over-temperature. The
battery charges automatically. If the capacity is not restored in several hours,
replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID

N/A

nvmem.battery.capacity.low.warn
Message

nvmem.battery.capacity.low.warn

Severity

INFO

Description

This message occurs when the NVMEM battery capacity is below normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.battery.capacity.normal
Message

nvmem.battery.capacity.normal

Severity

INFO

Description

This message occurs when the NVMEM battery capacity is normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.battery.current.high
Message

nvmem.battery.current.high

Severity

NODE_ERROR

Description

This message occurs when the NVMEM battery current is excessively high and
the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVMEM battery current is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvmem.battery.current.high.warn
Message

nvmem.battery.current.high.warn

Severity

INFO

Description

This message occurs when the NVMEM battery current is above normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.battery.sensor.unreadable
Message

nvmem.battery.sensor.unreadable

Severity

INFO

Description

This message occurs when the battery state of the battery-backed memory
(NVMEM) is unknown. One of the battery sensors is not readable.

Corrective action Shut down the system and power-cycle the controller module. If the problem is
not corrected, replace the battery. If the sensor is still not readable, replace the
controller module.
SNMP trap ID

N/A

nvmem.battery.temp.high
Message

nvmem.battery.temp.high

Severity

NODE_ERROR

Description

This message occurs when the NVMEM battery is too hot and the system is at a
high risk of data loss if power fails.

Corrective action If the system is excessively warm, allow it to cool gradually. If the NVMEM
battery temperature reading is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvmem.battery.temp.low
Message

nvmem.battery.temp.low

Severity

NODE_ERROR

Description

This message occurs when the NVMEM battery is too cold and the system is at
a high risk of data loss if power fails.

Corrective action If the system is excessively cold, allow it to warm gradually. If the NVMEM
battery temperature reading is still too low, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvmem.battery.temp.normal
Message

nvmem.battery.temp.normal

Severity

INFO

Description

This message occurs when the NVMEM battery temperature is normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.battery.voltage.high
Message

nvmem.battery.voltage.high

Severity

NODE_ERROR

Description

This message occurs when the NVMEM battery voltage is excessively high and
the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVMEM battery voltage is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvmem.battery.voltage.high.warn
Message

nvmem.battery.voltage.high.warn

Severity

INFO

Description

This message occurs when the NVMEM battery voltage is above normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.battery.voltage.normal
Message

nvmem.battery.voltage.normal

Severity

INFO

Description

This message occurs when the NVMEM battery voltage is normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.voltage.high
Message

nvmem.voltage.high

Severity

NODE_ERROR

Description

This message occurs when the NVMEM supply voltage is high and the system
is at a high risk of data loss if power fails.

Corrective action First, correct any environmental or battery problems. If the problem continues,
replace the controller module.

SNMP trap ID

N/A

nvmem.voltage.high.warn
Message

nvmem.voltage.high.warn

Severity

INFO

Description

This message occurs when the NVMEM supply voltage is above normal.

Corrective action

None.

SNMP trap ID

N/A

nvmem.voltage.normal
Message

nvmem.voltage.normal

Severity

INFO

Description

This message occurs when the NVMEM supply voltage is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.bat.missing.error
Message

nvram.bat.missing.error

Severity

NODE_ERROR

Description

This message occurs when the battery in the chassis is degrading.

Corrective action

Contact technical support.

SNMP trap ID

N/A

nvram.battery.capacity.low
Message

nvram.battery.capacity.low

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery lacks the capacity to preserve
the NVRAM contents for the required minimum of 72 hours. The system is at
risk of data loss if the power fails. This message repeats every hour while the
problem continues, and the system shuts down in 24 hours if automatic
recharging of the battery does not restore its charge.

Corrective action Correct any environmental problems, such as chassis over-temperature. The
battery charges automatically. If the capacity is not restored in several hours,
replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID

N/A

nvram.battery.capacity.low.critical
Message

nvram.battery.capacity.low.critical

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery capacity is dangerously low.
To prevent data loss, the system will shut down in 20 minutes.

Corrective action Correct any environmental problems, such as chassis over-temperature. The
battery charges automatically. If the capacity is not restored automatically,
replace the battery pack. If the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.capacity.low.warn
Message

nvram.battery.capacity.low.warn

Severity

INFO

Description

This message occurs when the NVRAM battery capacity is below normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.capacity.normal
Message

nvram.battery.capacity.normal

Severity

INFO

Description

This message occurs when the NVRAM battery capacity is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.charging.nocharge
Message

nvram.battery.charging.nocharge

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery is requesting to be charged but
the charger is not charging the battery. To prevent data loss, the system will
shut down in 20 minutes.

Corrective action Replace the NVRAM battery/card. If the problem persists, replace the
controller module.
SNMP trap ID

N/A

nvram.battery.charging.normal
Message

nvram.battery.charging.normal

Severity

INFO

Description

This message occurs when the NVRAM battery charging status is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.charging.wrongcharge
Message

nvram.battery.charging.wrongcharge

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery charger is charging the battery
even though the battery is not requesting to be charged. To prevent data loss,
the system will be shut down in 20 minutes.

Corrective action Replace the NVRAM battery. If the problem persists, replace the NVRAM
card.
SNMP trap ID

N/A

nvram.battery.current.high
Message

nvram.battery.current.high

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery current is excessively high and
the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery current is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.current.high.warn
Message

nvram.battery.current.high.warn

Severity

INFO

Description

This message occurs when the NVRAM battery current is above normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.current.low
Message

nvram.battery.current.low

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery has a short circuit.

Corrective action Replace the NVRAM battery/card. If the problem persists, replace the
controller module.
SNMP trap ID

N/A

nvram.battery.current.low.warn
Message

nvram.battery.current.low.warn

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery current is below normal.

Corrective action First, correct any environmental problems. If the NVRAM battery current is
still below normal, replace the NVRAM battery/card. If the problem persists,
replace the controller module.
SNMP trap ID

N/A

nvram.battery.current.normal
Message

nvram.battery.current.normal

Severity

INFO

Description

This message occurs when the NVRAM battery current is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.end_of_life.high
Message

nvram.battery.end_of_life.high

Severity

INFO

Description

This message occurs when the NVRAM battery-cycle count indicates that the
battery has reached its anticipated life expectancy.

Corrective action None.


SNMP trap ID

N/A

nvram.battery.end_of_life.normal
Message

nvram.battery.end_of_life.normal

Severity

INFO

Description

This message occurs when the NVRAM battery-cycle count indicates that the
battery is well below its anticipated life expectancy.

Corrective action None.


SNMP trap ID

N/A

nvram.battery.fault
Message

nvram.battery.fault

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery is reporting a fatal fault
condition. To prevent data loss, the system will shut down in 2 minutes.

Corrective action Correct any environmental problems, such as chassis over-temperature. If the
battery still reports a fatal fault condition, replace the NVRAM battery/card. If
the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.fault.warn
Message

nvram.battery.fault.warn

Severity

INFO

Description

This message occurs when the NVRAM battery is reporting a non-fatal fault
condition.

Corrective action Correct any environmental problems, such as chassis over-temperature.

SNMP trap ID

N/A

nvram.battery.fcc.low
Message

nvram.battery.fcc.low

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery full-charge capacity is low. To
prevent data loss, the system will shut down in 24 hours.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM full-charge capacity is still dangerously low, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.fcc.low.critical
Message

nvram.battery.fcc.low.critical

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery full-charge capacity is
dangerously low. To prevent data loss, the system will shut down in 20
minutes.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM full-charge capacity is still dangerously low, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.fcc.low.warn
Message

nvram.battery.fcc.low.warn

Severity

INFO

Description

This message occurs when the NVRAM battery full-charge capacity is below
normal.

Corrective action Replace the NVRAM battery/card during your next scheduled down-time
(within 3 months).
SNMP trap ID

N/A

nvram.battery.fcc.normal
Message

nvram.battery.fcc.normal

Severity

INFO

Description

This message occurs when the NVRAM battery full-charge capacity is normal.

Corrective action None.


SNMP trap ID

N/A

nvram.battery.power.fault
Message

nvram.battery.power.fault

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery is not receiving power.

Corrective action Correct any environmental problems, such as chassis over-temperature. If the
NVRAM battery is still not receiving power, replace the NVRAM battery/card. If
the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.power.normal
Message

nvram.battery.power.normal

Severity

INFO

Description

This message occurs when the NVRAM battery power is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.sensor.unreadable
Message

nvram.battery.sensor.unreadable

Severity

INFO

Description

This message occurs when the battery state of the battery-backed memory
(NVRAM) is unknown. One of the battery sensors is not readable.

Corrective action Shut down the system and power-cycle the controller module. If the problem is
not corrected, replace the NVRAM battery/card. If the sensor is still not
readable, replace the controller module.
SNMP trap ID

N/A

nvram.battery.temp.high
Message

nvram.battery.temp.high

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery is too hot and the system is at a
high risk of data loss if power fails.

Corrective action If the system is excessively warm, allow it to cool gradually. If the NVRAM
battery temperature reading is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.temp.high.warn
Message

nvram.battery.temp.high.warn

Severity

INFO

Description

This message occurs when the NVRAM battery temperature is high.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.temp.low
Message

nvram.battery.temp.low

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery is too cold and the system is at
a high risk of data loss if power fails.

Corrective action If the system is excessively cold, allow it to warm gradually. If the NVRAM
battery temperature reading is still too low, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.temp.low.warn
Message

nvram.battery.temp.low.warn

Severity

INFO

Description

This message occurs when the NVRAM battery temperature is low.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.temp.normal
Message

nvram.battery.temp.normal

Severity

INFO

Description

This message occurs when the NVRAM battery temperature is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.voltage.high
Message

nvram.battery.voltage.high

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery voltage is excessively high and
the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery voltage is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.voltage.high.warn
Message

nvram.battery.voltage.high.warn

Severity

INFO

Description

This message occurs when the NVRAM battery voltage is above normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.battery.voltage.low
Message

nvram.battery.voltage.low

Severity

NODE_ERROR

Description

This message occurs when the NVRAM battery voltage is critically low. To
prevent data loss, the system will shut down in 2 minutes.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery voltage is still critically low, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID

N/A

nvram.battery.voltage.low.warn
Message

nvram.battery.voltage.low.warn

Severity

INFO

Description

This message occurs when the NVRAM battery voltage is below normal. To
prevent data loss, the system will shut down in 24 hours.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery voltage is still below normal, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID

N/A
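
Several of the NVMEM and NVRAM battery entries above state how long the system waits before it shuts down. For triage, those delays can be gathered into one lookup. The following is a minimal sketch only: the event names and delays are copied from the Description fields in this section, while the `SHUTDOWN_DELAY` table and the `shutdown_delay` and `most_urgent` helpers are hypothetical names introduced for illustration and are not part of Data ONTAP.

```python
from datetime import timedelta
from typing import Iterable, Optional

# Delays copied from the Description fields in this section.
# Events whose description says only "the system will shut down"
# (no stated delay) are deliberately left out. The two capacity.low
# delays apply only if automatic recharging does not restore the charge.
SHUTDOWN_DELAY = {
    "nvmem.battery.capacity.low": timedelta(hours=24),
    "nvram.battery.capacity.low": timedelta(hours=24),
    "nvram.battery.capacity.low.critical": timedelta(minutes=20),
    "nvram.battery.charging.nocharge": timedelta(minutes=20),
    "nvram.battery.charging.wrongcharge": timedelta(minutes=20),
    "nvram.battery.fault": timedelta(minutes=2),
    "nvram.battery.fcc.low": timedelta(hours=24),
    "nvram.battery.fcc.low.critical": timedelta(minutes=20),
    "nvram.battery.voltage.low": timedelta(minutes=2),
    "nvram.battery.voltage.low.warn": timedelta(hours=24),
}


def shutdown_delay(event_name: str) -> Optional[timedelta]:
    """Return the documented shutdown delay, or None if none is stated."""
    return SHUTDOWN_DELAY.get(event_name)


def most_urgent(event_names: Iterable[str]) -> Optional[str]:
    """Return the listed event with the shortest documented delay, or None."""
    timed = [name for name in event_names if name in SHUTDOWN_DELAY]
    return min(timed, key=SHUTDOWN_DELAY.__getitem__, default=None)
```

A monitoring script could call `most_urgent` on the event names seen in a log excerpt to decide which message to act on first. Events without a stated delay return `None` from `shutdown_delay` and are ignored by `most_urgent`.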

nvram.battery.voltage.normal
Message

nvram.battery.voltage.normal

Severity

INFO

Description

This message occurs when the NVRAM battery voltage is normal.

Corrective action

None.

SNMP trap ID

N/A

nvram.hw.initFail
Message

nvram.hw.initFail

Severity

ERR

Description

This message occurs when the Data ONTAP NVRAM hardware fails to
initialize.

Corrective action Typically, this type of error is unexpected and indicates that the NVRAM
hardware is failing and should be replaced. Contact technical support for
assistance with the replacement.
SNMP trap ID

N/A

FCoE HBA EMS messages


FCoE messages appear if the CNA (Converged Network Adapter) MPI (Management Port Interface)
driver detects an unexpected event or illegal condition or if the HBA fails to initialize.

ispcna.mpi.dump
Message

ispcna.mpi.dump

Severity

SVC_ERROR

Description

This message occurs when an unexpected event or illegal condition is detected
by the CNA (Converged Network Adapter) Management Port Interface (MPI)
driver and the contents of the adapter's Static RAM and memory must be
dumped. After the dump, the adapter is reset and the contents of the dump are
stored in a file in the /etc/log/ql8mpi directory.

Corrective action None; the adapter was reset.

ispcna.mpi.dump.saved
Message

ispcna.mpi.dump.saved

Severity

SVC_ERROR

Description

This message occurs when an unexpected event or illegal condition is detected
by the CNA (Converged Network Adapter) Management Port Interface (MPI)
driver and the contents of the adapter's Static RAM and memory are saved. The
dump files are stored on the system's root volume in the /etc/log/ql8mpi
directory, with the following file name format: mpi[adapter]_[date]_[time].bin

Corrective action Send the dump file to technical support for analysis.
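
When collecting dump files for technical support, it can help to split the file name into its fields. The following is a minimal sketch based only on the mpi[adapter]_[date]_[time].bin format given above. This guide does not specify how the adapter, date, or time fields are encoded, so the sketch keeps them as opaque strings and assumes that none of them contains an underscore. `parse_mpi_dump_name` and `MpiDumpName` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumption: none of the three fields contains an underscore. The guide
# does not say how [adapter], [date], or [time] are encoded, so each is
# returned as an uninterpreted string.
_DUMP_NAME = re.compile(
    r"^mpi(?P<adapter>[^_]+)_(?P<date>[^_]+)_(?P<time>[^_]+)\.bin$"
)


class MpiDumpName(NamedTuple):
    adapter: str
    date: str
    time: str


def parse_mpi_dump_name(file_name: str) -> Optional[MpiDumpName]:
    """Split a dump file name into its fields, or return None on mismatch."""
    match = _DUMP_NAME.match(file_name)
    if match is None:
        return None
    return MpiDumpName(match["adapter"], match["date"], match["time"])
```

Names that do not follow the documented pattern return `None`, so the function can also be used to filter a directory listing.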

ispcna.mpi.initFailed
Message

ispcna.mpi.initFailed

Severity

NODE_ERROR

Description

This message occurs when the CNA (Converged Network Adapter) fails to
initialize.

Corrective action Take corrective actions based on the indicated reason for the failure.

Flash Cache module and PAM module EMS messages


The caching module WAFL cache, hardware driver, and system monitoring can generate error
messages. All messages are reported through the EMS.
This document uses the term Flash Cache module to refer to caching modules with capacities greater
than 16 GB. Before the release of Data ONTAP 7.3.5, such adapters were called Performance
Acceleration Modules (PAM II). The name of the 16-GB caching module remains Performance
Acceleration Module (PAM I).

callhome.flash.cache.failed
Message

callhome.flash.cache.failed

Severity

NODE_ERROR

Description

This message occurs when Flash Management Module (FMM) detects that a
caching module has suffered a failure. Typically, this is the result of a hardware
failure on the caching module itself. FMM monitors all flash devices in the system.
If your system is configured to do so, it generates and transmits an AutoSupport (or
"call home") message to customer support and to the configured destinations.
Successful delivery of an AutoSupport message significantly improves problem
determination and resolution.

Corrective action A caching module has failed. This is either an indication or the cause of
performance degradation. The exact impact cannot be estimated. The caching
module needs to be repaired or replaced. Contact customer support for more
details.

extCache.io.BlockChecksumError
Message

extCache.io.BlockChecksumError

Severity

NODE_ERROR

Description

This message occurs when the external cache detects a block checksum
verification error while performing a read operation. The operation will be
retried from persistent storage (RAID).

Corrective action Contact technical support.
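
The behavior described above, where a read that fails checksum verification in the cache is retried from persistent storage, follows a common verify-then-fall-back pattern. The following sketch illustrates that general pattern only; it is not the Data ONTAP implementation. The use of CRC-32, the dictionary-based cache, and the names `read_block` and `read_from_raid` are all assumptions made for illustration.

```python
import zlib
from typing import Callable, Dict, Tuple


def read_block(block_id: int,
               cache: Dict[int, Tuple[bytes, int]],
               read_from_raid: Callable[[int], bytes]) -> bytes:
    """Return block data, using the cached copy only if its checksum verifies.

    `cache` maps a block id to (data, stored_checksum). CRC-32 stands in
    for whatever checksum the real cache uses (an assumption).
    """
    entry = cache.get(block_id)
    if entry is not None:
        data, stored_checksum = entry
        if zlib.crc32(data) == stored_checksum:
            return data
        # Checksum mismatch: ignore the cached copy and fall through
        # to persistent storage.
    return read_from_raid(block_id)
```

The caller receives correct data in either case; the cost of a checksum error is a slower read from the backing store, which matches the degraded-performance effect described for the external cache messages.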

extCache.io.cardError
Message

extCache.io.cardError

Severity

NODE_ERROR

Description

This message occurs when the external cache detects a card failure on read or
write I/O. If the I/O was a read, the operation will be retried from persistent
storage (RAID).

Corrective action Contact technical support.

extCache.io.readError
Message

extCache.io.readError

Severity

NODE_ERROR

Description

This message occurs when the external cache detects an I/O error on a read.
The operation will be retried from persistent storage (RAID).

Corrective action Contact technical support.

extCache.io.writeError
Message

extCache.io.writeError

Severity

NODE_ERROR

Description

This message occurs when the external cache detects an I/O error on a write.
This causes the external cache component to be disabled and might result in
degraded performance until the problem is corrected.

Corrective action Contact technical support.

extCache.offline
Message

extCache.offline

Severity

SVC_ERROR

Description

This message occurs when the external cache is automatically taken offline and
disabled. This can happen after an I/O error on the external cache and might
result in degraded performance until the problem is corrected. Check the Event
Management System (EMS) log for earlier errors.

Corrective action Contact technical support.

extCache.ReconfigComplete
Message

extCache.ReconfigComplete

Severity

NODE_ERROR

Description

This message occurs when the Write Anywhere File Layout (WAFL) external
cache has detected a failure of one or more cache memory cards, and was able
to successfully reconfigure to continue operation with the remaining cards.

Corrective action None.

extCache.ReconfigFailed
Message

extCache.ReconfigFailed

Severity

NODE_ERROR

Description

This message occurs when an attempt to reconfigure the external cache has
failed. The message identifies what step of the reconfiguration failed.

Corrective action Contact technical support.

extCache.ReconfigStart
Message

extCache.ReconfigStart

Severity

NODE_ERROR

Description

This message occurs when the Write Anywhere File Layout (WAFL) external
cache has detected a failure of one or more cache memory cards. An attempt will
be made to restart the cache with the remaining card(s). Even if the cache is
restarted, performance might be degraded because of the reduced cache size.
See related EMS messages for details of the failing unit.

Corrective action Contact technical support.

extCache.UECCerror
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

extCache.UECCerror

Severity

NODE_ERROR

Description

This message occurs when an uncorrectable multi-bit ECC memory error is
reported to the Write Anywhere File Layout (WAFL) file system external cache.
When this event occurs, the data is re-read from persistent storage (RAID)
and operation continues. See related EMS messages for details about the failing
unit.

Corrective action If multiple uncorrectable multi-bit ECC errors are issued, this indicates that a
hardware component might be failing and should be considered for replacement.

extCache.UECCmax
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

extCache.UECCmax

Severity

NODE_ERROR

Description

This message occurs when the Write Anywhere File Layout (WAFL) file
system external cache has detected excessive multi-bit uncorrectable ECC
memory errors in a recent period. When too many multi-bit ECC errors are
reported, WAFL disables the external cache until the failing component is
replaced, resulting in degraded performance. See related EMS messages for
details about the failing unit.

Corrective action Contact technical support.
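
The replacement rule in the note for the 16-GB Performance Acceleration Module (more than 10 uncorrectable ECC memory errors per day for three consecutive days) can be checked mechanically once daily error counts are available. The following is a minimal sketch of that rule. How the daily counts are gathered is outside its scope, and `should_replace_pam` and its parameters are hypothetical names introduced for illustration.

```python
from typing import Sequence


def should_replace_pam(daily_uecc_counts: Sequence[int],
                       per_day_limit: int = 10,
                       consecutive_days: int = 3) -> bool:
    """Return True if the counts contain `consecutive_days` days in a row
    with more than `per_day_limit` uncorrectable ECC errors each.

    `daily_uecc_counts` holds one count per calendar day, oldest first.
    """
    streak = 0
    for count in daily_uecc_counts:
        # A day at or below the limit breaks the run of bad days.
        streak = streak + 1 if count > per_day_limit else 0
        if streak >= consecutive_days:
            return True
    return False
```

Exactly 10 errors in a day does not count toward the streak, because the rule says "more than 10".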

fal.chan.offline.comp
Message

fal.chan.offline.comp

Severity

INFO

Description

This message occurs when the FAL (Flash Adaptation Layer) finishes taking a
channel offline.

Corrective action None.

fal.chan.online.erase.warn
Message

fal.chan.online.erase.warn

Severity

INFO

Description

This message occurs when an erase of a label block fails while attempting to
bring online a channel of a card. This could lead to a failure to read the label
(see the fal.chan.online.read.warn event).

Corrective action None.

fal.chan.online.fail
Message

fal.chan.online.fail

Severity

SVC_ERROR

Description

This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online a channel of a card for the mentioned reason.

Corrective action None.

fal.chan.online.read.warn
Message

fal.chan.online.read.warn

Severity

INFO

Description

This message occurs when the read of a label fails while attempting to bring online
a channel of a module. This is expected on the first boot with a Flash Cache
module. Otherwise, it means existing FAL (Flash Adaptation Layer) label
information is lost. The current version of software does not depend on label
information, so this loss is not a problem right now. However, future versions of
software might store cache data persistently. If persistent data is stored on a card
and this version of software is booted on such a system, failure to read the label
might lead to loss of some cached data.

Corrective action None.

fal.chan.online.rep.fail
Message

fal.chan.online.rep.fail

Severity

SVC_ERROR

Description

This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online all channels in a caching module. The reasons for failure are listed in the
accompanying fal.chan.online.fail events.

Corrective action Contact technical support.

fal.chan.online.rep.part
Message

fal.chan.online.rep.part

Severity

SVC_ERROR

Description

This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online some channels in a caching module. The reasons for failure are listed in
the accompanying fal.chan.online.fail events.

Corrective action Contact technical support.

fal.chan.online.rep.succ
Message

fal.chan.online.rep.succ

Severity

INFO

Description

This message occurs when the FAL (Flash Adaptation Layer) successfully
brings online all channels in a card.

Corrective action None.
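
The three summary events above (fal.chan.online.rep.fail, fal.chan.online.rep.part, and fal.chan.online.rep.succ) differ only in how many channels came online. The following Python sketch restates that relationship. It is one reading of the descriptions in this guide, not a description of how Data ONTAP decides internally; the function name and its arguments are hypothetical.

```python
def summary_event(channels_online: int, channels_total: int) -> str:
    """Return the FAL summary event name implied by the channel counts.

    Hypothetical helper for illustration only; the mapping follows the
    event descriptions in this guide, not any documented API.
    """
    if channels_total <= 0 or not 0 <= channels_online <= channels_total:
        raise ValueError("channel counts are out of range")
    if channels_online == channels_total:
        return "fal.chan.online.rep.succ"   # all channels came online
    if channels_online == 0:
        return "fal.chan.online.rep.fail"   # no channel came online
    return "fal.chan.online.rep.part"       # only some channels came online
```

For example, a module with four channels of which three came online would map to fal.chan.online.rep.part under this reading.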

fal.chan.online.rep.ver.err
Message

fal.chan.online.rep.ver.err

Severity

SVC_ERROR

Description

This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online all channels in a caching module because of a version mismatch.

Corrective action Follow the documented revert procedure.

fal.chan.online.write.warn
Message

fal.chan.online.write.warn

Severity

INFO

Description

This message occurs when a write of a label block fails while attempting to
bring online a channel of a module. This could lead to a failure to read the label
(see the fal.chan.online.read.warn event).

Corrective action None.

fal.init.failed
Message

fal.init.failed

Severity

SVC_ERROR

Description

This message occurs when the FAL (Flash Adaptation Layer) fails to initialize.
This error likely indicates a software bug.

Corrective action Contact technical support.

fmm.bad.block.detected
Message

fmm.bad.block.detected

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) gets a message
from a flash device driver reporting that a bad block is detected.

Corrective action None.

fmm.device.stats.missing
Message

fmm.device.stats.missing

Severity

DEBUG

Description

This message occurs when the onboard copy of statistics maintained by Flash
Management Module (FMM) is missing. This can happen when a device is
initially activated in the controller.

Corrective action None.

fmm.domain.card.failure
Message

fmm.domain.card.failure

Severity

SVC_ERROR

Description

This message occurs when the Flash Management Module (FMM) detects that
a flash device failed. Typically, this is the result of a hardware failure on the
flash device itself.

Corrective action Repair or replace the failed flash device.

fmm.domain.core.failure
Message

fmm.domain.core.failure

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that a
core domain on a flash device managed by FMM has failed. Typically, this is
the result of a hardware failure on the flash device itself. Core failure is not
considered to be fatal.

Corrective action None.

fmm.domain.lun.failure
Message

fmm.domain.lun.failure

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that a
LUN domain on a flash device managed by FMM has failed. Typically, this is
the result of a hardware failure on the flash device itself. LUN failure is not
considered fatal.

Corrective action None.

fmm.hourly.device.report
Message

fmm.hourly.device.report

Severity

DEBUG

Description

This message is sent by Flash Management Module (FMM) every hour to
report the status of a flash device that FMM manages.

Corrective action None.

fmm.log.bb
Message

fmm.log.bb

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) gets a message
from a flash device driver reporting that a bad block is detected.

Corrective action None.

fmm.threshold.bank.degraded
Message

fmm.threshold.bank.degraded

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that in a
flash device, the percentage of a bank that is offline is above a warning
threshold. FMM responds with the action described by the action parameter.

Corrective action None.

fmm.threshold.bank.offline
Message

fmm.threshold.bank.offline

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that in a
flash device, a critical percentage of a bank is offline, beyond which the bank
cannot operate. FMM responds with the action described by the action
parameter.

Corrective action None.

fmm.threshold.card.degraded
Message

fmm.threshold.card.degraded

Severity

SVC_ERROR

Description

This message occurs when the Flash Management Module (FMM) detects that the
offline percentage of a flash device exceeds a specified warning threshold.
FMM responds with the action described by the action parameter.

Corrective action Repair or replace this degraded flash device.

fmm.threshold.card.failure
Message

fmm.threshold.card.failure

Severity

SVC_ERROR

Description

This message occurs when Flash Management Module (FMM) detects that the
offline percentage of a flash device exceeds a specified critical threshold,
beyond which the device cannot operate. FMM responds with the action described
by the action parameter.

Corrective
action

This flash device can no longer operate and will be taken offline. Repair or
replace the flash device.
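
The fmm.threshold.card.degraded and fmm.threshold.card.failure events describe two thresholds on the same measurement: a warning threshold and a critical threshold for the offline percentage of a flash device. The Python sketch below shows how such a two-level check might be expressed. The function name is hypothetical, and the thresholds are passed in as arguments because this guide does not state their values.

```python
def classify_offline_percentage(offline_pct: float,
                                warning_pct: float,
                                critical_pct: float) -> str:
    """Classify a flash device by the percentage of it that is offline.

    The threshold values are inputs because this guide does not state them.
    """
    if not 0.0 <= warning_pct <= critical_pct <= 100.0:
        raise ValueError("expected 0 <= warning_pct <= critical_pct <= 100")
    if offline_pct > critical_pct:
        return "fmm.threshold.card.failure"    # device can no longer operate
    if offline_pct > warning_pct:
        return "fmm.threshold.card.degraded"   # repair or replace is advised
    return "ok"
```

The critical check comes first so that a device past both thresholds is reported as failed rather than degraded.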

fmm.threshold.core.offline
Message

fmm.threshold.core.offline

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that an
excessive number of blocks in a core of a flash device have gone bad. The
threshold for a core is defined as a percentage of bad blocks, and when that
threshold is exceeded, FMM responds with the action described by the action
parameter.

Corrective
action

None.

fmm.threshold.lun.offline
Message

fmm.threshold.lun.offline

Severity

DEBUG

Description

This message occurs when Flash Management Module (FMM) detects that an
excessive number of blocks in a flash device LUN have gone bad. The threshold
for a LUN is defined as a percentage of bad blocks, and when that threshold is
exceeded, FMM responds with the action described by the action parameter.

Corrective
action

None.

iomem.bbm.bbtl.overflow
Message

iomem.bbm.bbtl.overflow

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that the Bad
Block Transaction Log has overflowed.

Corrective action None.

iomem.bbm.init.failed
Message

iomem.bbm.init.failed

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that an operation
to a NOR flash memory has failed.

Corrective action None.

iomem.bbm.new.flash
Message

iomem.bbm.new.flash

Severity

DEBUG

Description

This message occurs when the caching module driver detects that a NAND
flash package has been replaced.

Corrective action None.

iomem.card.disable
Message

iomem.card.disable

Severity

WARNING

Description

This message occurs when the caching module has been disabled as a result of
an explicit diagnostic command.

Corrective action None.

iomem.card.enable
Message

iomem.card.enable

Severity

INFO

Description

This message occurs when the caching module has been enabled as a result of
an explicit diagnostic command.

Corrective action None.

iomem.card.fail.cecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

iomem.card.fail.cecc

Severity

NODE_ERROR

Description

This message occurs when the caching module driver takes an acceleration card
offline due to an excessive number of correctable memory errors.

Corrective action Replace the caching module.

iomem.card.fail.data.crc
Message

iomem.card.fail.data.crc

Severity

NODE_ERROR

Description

This message occurs when the caching module driver takes a caching module
offline due to an excessive number of detected data cyclic redundancy check
(CRC) errors.

Corrective action Replace the caching module.

iomem.card.fail.desc.crc
Message

iomem.card.fail.desc.crc

Severity

NODE_ERROR

Description

This message occurs when the caching module driver takes a caching module
offline due to an excessive number of detected descriptor cyclic redundancy
check (CRC) errors.

Corrective action Replace the caching module.

iomem.card.fail.dimm
Message

iomem.card.fail.dimm

Severity

NODE_ERROR

Description

This message occurs when the caching module driver takes a caching module
offline due to the failure of a memory DIMM.

Corrective action Replace the caching module.

iomem.card.fail.firmware.primary
Message

iomem.card.fail.firmware.primary

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that the module is
not running on the primary firmware image. The card does not function unless it
is running on the primary image.

Corrective action

Note: The following steps are for systems that use the SYSDIAG diagnostic tool.
32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.

1. Enter the following command at the boot environment prompt:

   boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Take one of the following actions:

   If your system has a 16-GB Performance Acceleration Module, select the
   iomem submenu and then run test 62, Update FPGA [Extended].

   If your system has a 256-GB or 512-GB Performance Acceleration Module,
   select the pam2 submenu and then run test 61, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.
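
The choice in step 3 depends only on the capacity of the module. As a quick reference, the Python sketch below encodes that choice as a lookup table. The table contents come from step 3; the names FPGA_UPDATE_TEST and fpga_update_step are hypothetical and are not part of SYSDIAG.

```python
# Lookup built from step 3 above: module capacity in GB -> (submenu, test number).
FPGA_UPDATE_TEST = {
    16: ("iomem", 62),
    256: ("pam2", 61),
    512: ("pam2", 61),
}


def fpga_update_step(capacity_gb: int) -> str:
    """Describe which SYSDIAG submenu and test to run for a module capacity."""
    try:
        submenu, test = FPGA_UPDATE_TEST[capacity_gb]
    except KeyError:
        # Capacities not listed in step 3 are rejected rather than guessed.
        raise ValueError(
            f"no SYSDIAG step listed for a {capacity_gb}-GB module"
        ) from None
    return f"select the {submenu} submenu and run test {test}, Update FPGA [Extended]"
```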

iomem.card.fail.fpga
Message

iomem.card.fail.fpga

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a fatal operational
error with the onboard field programmable gate array (FPGA) hardware and is
taking the caching module offline.

Corrective action Contact technical support.

iomem.card.fail.fpga.primary
Message

iomem.card.fail.fpga.primary

Severity

NODE_ERROR

Description

This message occurs when the acceleration card driver detects that the card is not
running on the primary firmware image. The card does not function unless it is
running on the primary image.

Corrective action

Note: The following steps are for systems that use the SYSDIAG diagnostic
tool. 32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.

Take one of the following actions:

If you have a 16-GB Performance Acceleration Module, complete the following
steps:

1. Enter the following command at the boot environment prompt:

   boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Run test 62, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.

If you have a Flash Cache module, the FPGA firmware should be programmed
automatically. Other EMS messages earlier in the log should indicate why
programming failed.

iomem.card.fail.fpga.rev
Message

iomem.card.fail.fpga.rev

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that the field
programmable gate array (FPGA) firmware image is a revision not supported by
the driver.

Corrective action

Note: The following steps are for systems that use the SYSDIAG diagnostic
tool. 32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.

Take one of the following actions:

If you have a 16-GB Performance Acceleration Module, complete the
following steps:

1. Enter the following command at the boot environment prompt:

   boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Run test 62, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.

If you have a Flash Cache module, the FPGA firmware should be programmed
automatically. Other EMS messages earlier in the log should indicate why
programming failed.

iomem.card.fail.internal
Message

iomem.card.fail.internal

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a fatal internal
error on the caching module and is taking the module offline.

Corrective action Contact technical support.

iomem.card.fail.pci
Message

iomem.card.fail.pci

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a fatal PCI error
on the caching module and is taking the module offline.

Corrective action Contact technical support.

iomem.card.fail.uecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

iomem.card.fail.uecc

Severity

NODE_ERROR

Description

This message occurs when the caching module driver takes a caching module
offline due to an excessive number of uncorrectable memory errors.

Corrective action Replace the caching module.

iomem.dimm.log.checksum
Message

iomem.dimm.log.checksum

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a checksum error
in the error log for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.log.init
Message

iomem.dimm.log.init

Severity

INFO

Description

This message occurs when the caching module driver initializes the error log
for a DIMM.

Corrective action None.

iomem.dimm.log.read
Message

iomem.dimm.log.read

Severity

NODE_ERROR

Description

This message occurs when the caching module driver fails to read the error log
for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.log.sync
Message

iomem.dimm.log.sync

Severity

INFO

Description

This message occurs when the caching module driver is writing the error log for
a DIMM to persistent storage.

Corrective action None.

iomem.dimm.log.write
Message

iomem.dimm.log.write

Severity

NODE_ERROR

Description

This message occurs when the caching module driver fails to write the error log
for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.banks
Message

iomem.dimm.mismatch.banks

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of banks that does not match that of the other installed DIMMs on the
caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.burst
Message

iomem.dimm.mismatch.burst

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
burst size that does not match that of the other installed DIMMs on the caching
module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.casLatency
Message

iomem.dimm.mismatch.casLatency

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
column address select (CAS) latency that does not match that of the other
installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.columns
Message

iomem.dimm.mismatch.columns

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of columns that does not match that of the other installed DIMMs on
the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.dataWidth
Message

iomem.dimm.mismatch.dataWidth

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
data synchronous dynamic RAM (SDRAM) width that does not match that of
the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.eccWidth
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

iomem.dimm.mismatch.eccWidth

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with an
ECC synchronous dynamic RAM (SDRAM) width that does not match that of
the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.ranks
Message

iomem.dimm.mismatch.ranks

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of ranks that does not match that of the other installed DIMMs on the
caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.rows
Message

iomem.dimm.mismatch.rows

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of rows that does not match that of the other installed DIMMs on the
caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.vendor
Message

iomem.dimm.mismatch.vendor

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
manufacturer ID that does not match that of the other installed DIMMs on the
caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.banks
Message

iomem.dimm.spd.banks

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of banks incompatible with the memory controller of the caching
module.

Corrective action Replace the caching module.

iomem.dimm.spd.burst
Message

iomem.dimm.spd.burst

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
burst size incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.casLatency
Message

iomem.dimm.spd.casLatency

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
column address select (CAS) latency incompatible with the memory controller
of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.checksum
Message

iomem.dimm.spd.checksum

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a checksum error
for the identifying information read from the serial presence detect (SPD)
electronically erasable programmable read-only memory (EEPROM) of a
DIMM installed on the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.columns
Message

iomem.dimm.spd.columns

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of columns incompatible with the memory controller of the caching
module.

Corrective action Replace the caching module.

iomem.dimm.spd.dataWidth
Message

iomem.dimm.spd.dataWidth

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
data synchronous dynamic RAM (SDRAM) width incompatible with the
memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.detect
Message

iomem.dimm.spd.detect

Severity

INFO

Description

This message occurs when the caching module driver detects the presence of an
installed DIMM during initialization.

Corrective action None.

iomem.dimm.spd.eccWidth
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

iomem.dimm.spd.eccWidth

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with an
ECC synchronous dynamic RAM (SDRAM) width incompatible with
the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.ranks
Message

iomem.dimm.spd.ranks

Severity

NODE_ERROR

Description

This message occurs when the acceleration card driver detects a DIMM with a
number of ranks incompatible with the memory controller of the acceleration
card.

Corrective action Replace the acceleration card.

iomem.dimm.spd.read
Message

iomem.dimm.spd.read

Severity

NODE_ERROR

Description

This message occurs when the caching module driver fails to read the
identifying information from the synchronous dynamic RAM (SDRAM)
electronically erasable programmable read-only memory (EEPROM) of a DIMM
installed on the caching module.

Corrective action Replace the acceleration card.

iomem.dimm.spd.rows
Message

iomem.dimm.spd.rows

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a DIMM with a
number of rows incompatible with the memory controller of the caching
module.

Corrective action Replace the caching module.

iomem.dma.crc.data
Message

iomem.dma.crc.data

Severity

WARNING

Description

This message occurs when the caching module driver detects a data checksum
error for data in transit across the PCI link between the system and the caching
module.

Corrective action Contact technical support.

iomem.dma.crc.desc
Message

iomem.dma.crc.desc

Severity

WARNING

Description

This message occurs when the caching module driver detects a descriptor
checksum error for data in transit across the PCI link between the system and
the caching module.

Corrective action Contact technical support.

iomem.dma.internal
Message

iomem.dma.internal

Severity

WARNING

Description

This message occurs when the caching module driver detects an internal direct
memory access (DMA) error during data transfer.

Corrective action Contact technical support.

iomem.dma.stall
Message

iomem.dma.stall

Severity

WARNING

Description

This message occurs when the acceleration card driver detects that a direct
memory access (DMA) channel has unexpectedly stalled and is attempting to
restart the DMA channel for normal operation.

Corrective action None.

iomem.ecc.cecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.

Message

iomem.ecc.cecc

Severity

WARNING

Description

This message occurs when a correctable ECC memory error is detected while
accessing the memory of a caching module. Frequent correctable ECC errors
usually indicate that a hardware memory component of the caching module is
failing.

Corrective action None.

iomem.ecc.correct.off
Message

iomem.ecc.correct.off

Severity

WARNING

Description

This message occurs when the error correction code (ECC) memory error
correction has been disabled for a caching module.

Corrective
action

ECC error correction should never be disabled for the caching module under
normal operating conditions. The only way that this can occur is if it has been
explicitly disabled through a private diagnostic interface. If this message is
encountered under normal operating conditions, contact technical support.

iomem.ecc.correct.on
Message

iomem.ecc.correct.on

Severity

INFO

Description

This message occurs when the error correction code (ECC) memory error
correction has been enabled for a caching module.

Corrective action None.

iomem.ecc.detect.off
Message

iomem.ecc.detect.off

Severity

WARNING

Description

This message occurs when the error correction code (ECC) memory error
detection has been disabled for an acceleration card.

Corrective
action

ECC error detection should never be disabled for the caching module under
normal operating conditions. The only way that this can occur is if the
functionality has been explicitly disabled via a private diagnostic interface. If
this message is encountered under normal operating conditions, contact
technical support.

iomem.ecc.detect.on
Message

iomem.ecc.detect.on

Severity

INFO

Description

This message occurs when the error correction code (ECC) memory error
detection has been enabled for a caching module.

Corrective action None.

iomem.ecc.inject
Message

iomem.ecc.inject

Severity

WARNING

Description

This message occurs when an error correction code (ECC) memory error is
manually injected into the memory of a caching module. This injection event
will only occur during diagnostic testing.

Corrective action None.

iomem.ecc.summary
Message

iomem.ecc.summary

Severity

WARNING

Description

This message occurs when the caching module driver makes its periodic error
summary report indicating that uncorrectable memory errors have been detected
on the acceleration card.

Corrective action Replace the acceleration card.

iomem.ecc.uecc
Message

iomem.ecc.uecc

Severity

NODE_ERROR

Description

This message occurs when an uncorrectable ECC memory error is detected while
accessing the memory of a caching module. Uncorrectable ECC errors indicate
that a hardware memory component of the caching module has failed or is
failing. Uncorrectable memory errors can only be isolated to a pair of DIMMs on
the caching module.

Corrective action

None.

Note: If you have a 16-GB Performance Acceleration Module, and if more
than 10 uncorrectable ECC memory errors occur per day for three consecutive
days, replace the module.
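
The replacement rule in the note above can be checked mechanically if daily error counts are available. The Python sketch below is a minimal illustration of that rule. It assumes the counts have already been collected into a list with one entry per day, oldest first; how they are gathered from the EMS log is outside its scope, and the function and constant names are hypothetical.

```python
from typing import Sequence

ERRORS_PER_DAY_LIMIT = 10   # "more than 10 ... per day" from the note
CONSECUTIVE_DAYS = 3        # "for three consecutive days" from the note


def should_replace_module(daily_uecc_counts: Sequence[int]) -> bool:
    """Return True if three consecutive days each exceed the daily limit.

    daily_uecc_counts holds one uncorrectable ECC error count per day,
    oldest first.
    """
    run = 0  # length of the current streak of over-limit days
    for count in daily_uecc_counts:
        run = run + 1 if count > ERRORS_PER_DAY_LIMIT else 0
        if run >= CONSECUTIVE_DAYS:
            return True
    return False
```

A day with exactly 10 errors does not count toward the streak, because the note says "more than 10".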

iomem.fail.stripe
Message

iomem.fail.stripe

Severity

INFO

Description

This message occurs when an erase stripe is being failed.

Corrective action

None.

iomem.firmware.package.access
Message

iomem.firmware.package.access

Severity

NODE_ERROR

Description

This message occurs when the caching module driver encounters a problem
while accessing the firmware package. The caching module might continue to
function, but it is recommended that you follow the corrective action at the
earliest opportunity.

Corrective
action

Reinstall the Data ONTAP software package or service image.

iomem.firmware.primary
Message

iomem.firmware.primary

Severity

WARNING

Description

This message occurs when the caching module driver detects that the card is not
running on the primary firmware image. The card does not function unless it is
running on the primary image.

Corrective action None.

iomem.firmware.program.complete
Message

iomem.firmware.program.complete

Severity

INFO

Description

This message occurs when the caching module driver finishes the programming
procedure for the caching module firmware.

Corrective action None.

iomem.firmware.program.fail
Message

iomem.firmware.program.fail

Severity

NODE_ERROR

Description

This message occurs when the caching module driver fails to program the card
firmware.

Corrective action Contact technical support.

iomem.firmware.program.reboot
Message

iomem.firmware.program.reboot

Severity

INFO

Description

This message occurs when the caching module driver triggers a reboot due to
programming firmware on one or more caching modules.

iomem.firmware.program.start
Message

iomem.firmware.program.start

Severity

INFO

Description

This message occurs when the caching module driver begins the programming
procedure for the module firmware.

Corrective action None.

iomem.firmware.rev
Message

iomem.firmware.rev

Severity

WARNING

Description

This message occurs when the caching module driver detects that the field
programmable gate array (FPGA) firmware image is a revision not supported
by the driver.

Corrective action None.

iomem.flash.mismatch.id
Message

iomem.flash.mismatch.id

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a flash device
with an identifier that does not match the identifier contained in the
field-replaceable unit (FRU) information. The caching module is not functional
until you resolve this issue.

Corrective action Contact technical support.

iomem.fru.badInfo
Message

iomem.fru.badInfo

Severity

WARNING

Description

This message occurs when the caching module driver detects invalid
information in the field-replaceable unit (FRU) electronically erasable
programmable read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.fru.checksum
Message

iomem.fru.checksum

Severity

WARNING

Description

This message occurs when the caching module driver detects a checksum error
in the card field-replaceable unit (FRU) information for the caching module.

Corrective action Replace the caching module.

iomem.fru.read
Message

iomem.fru.read

Severity

WARNING

Description

This message occurs when the caching module driver encounters an error
reading the field-replaceable unit (FRU) electronically erasable programmable
read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.fru.write
Message

iomem.fru.write

Severity

WARNING

Description

This message occurs when the caching module driver encounters an error
writing the field-replaceable unit (FRU) electronically erasable programmable
read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.i2c.link.down
Message

iomem.i2c.link.down

Severity

WARNING

Description

This message occurs when the caching module driver detects the failure of
the Inter-Integrated Circuit (I2C) serial link on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.addrNACK
Message

iomem.i2c.read.addrNACK

Severity

WARNING

Description

This message occurs when the caching module driver detects an address
negative acknowledgment (NACK) error condition when reading data from an
Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.dataNACK
Message

iomem.i2c.read.dataNACK

Severity

WARNING

Description

This message occurs when the caching module driver detects a data negative
acknowledgment (NACK) error condition when reading data from an
Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.timeout
Message

iomem.i2c.read.timeout

Severity

WARNING

Description

This message occurs when the caching module driver times out while trying to
read data from an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.addrNACK
Message

iomem.i2c.write.addrNACK

Severity

WARNING

Description

This message occurs when the caching module driver detects an address
negative acknowledgment (NACK) error condition when writing data to an
Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.dataNACK
Message

iomem.i2c.write.dataNACK

Severity

WARNING

Description

This message occurs when the caching module driver detects a data negative
acknowledgment (NACK) error condition when writing data to an
Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.timeout
Message

iomem.i2c.write.timeout

Severity

WARNING

Description

This message occurs when the caching module driver times out while trying to
write data to an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.init.detect.fpga
Message

iomem.init.detect.fpga

Severity

INFO

Description

This message occurs when the field-programmable gate array (FPGA) on a
caching module is detected and initialized for use by the driver.

Corrective action None.

iomem.init.detect.pci
Message

iomem.init.detect.pci

Severity

INFO

Description

This message occurs when a caching module is detected in a PCI slot and is
being initialized for use by the system.

Corrective action None.

iomem.init.fail
Message

iomem.init.fail

Severity

NODE_ERROR

Description

This message occurs when the caching module driver fails to initialize a
caching module.

Corrective action Look for the specific failure log messages in the EMS log prior to this message;
they identify the reason for the failure.
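
The corrective action above amounts to scanning backward from the failure message. The following is a minimal sketch of that search, assuming the EMS log has already been exported as a list of text lines; the sample line format and the `window` size are illustrative assumptions, not the actual EMS log format.

```python
from typing import List


def failures_before_init_fail(log_lines: List[str], window: int = 20) -> List[str]:
    """Return the iomem.* lines logged shortly before the first iomem.init.fail.

    `window` (an assumed value) limits how many preceding lines are examined.
    """
    for index, line in enumerate(log_lines):
        if "iomem.init.fail" in line:
            start = max(0, index - window)
            # Keep only caching module (iomem) messages from the preceding lines.
            return [earlier for earlier in log_lines[start:index] if "iomem." in earlier]
    return []


# Hypothetical log lines; the real EMS log layout may differ.
sample_log = [
    "node1 iomem.init.detect.pci INFO",
    "node1 sas.adapter.online INFO",
    "node1 iomem.train.fail NODE_ERROR",
    "node1 iomem.init.fail NODE_ERROR",
]

for entry in failures_before_init_fail(sample_log):
    print(entry)
```

In this made-up sample, the `iomem.train.fail` line is the one that would explain the initialization failure.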

iomem.memory.flash.syndrome
Message

iomem.memory.flash.syndrome

Severity

DEBUG

Description

This message occurs when the caching module driver detects a syndrome code
associated with a flash memory access.

Corrective action None.

iomem.memory.none
Message

iomem.memory.none

Severity

NODE_ERROR

Description

This message occurs when the caching module driver cannot detect any
installed memory on a caching module.

Corrective action Replace the caching module.

iomem.memory.power.high
Message

iomem.memory.power.high

Severity

WARNING

Description

This message occurs when the memory of the caching module has been
configured to operate in high power mode.

Corrective
action

Memory high power mode should never be enabled for the caching module
under normal operating conditions. The only way that this can occur is if it has
been explicitly enabled via a private diagnostic interface. If this message is
encountered under normal operating conditions, contact technical support.

iomem.memory.power.low
Message

iomem.memory.power.low

Severity

INFO

Description

This message occurs when the memory DIMMs of the caching module have
been configured to operate in low power mode.

Corrective action None.

iomem.memory.scrub.start
Message

iomem.memory.scrub.start

Severity

INFO

Description

This message occurs when the background error correction code (ECC)
memory scrubbing process on a caching module is starting.

Corrective action None.

iomem.memory.size
Message

iomem.memory.size

Severity

INFO

Description

This message occurs when the caching module driver has determined the
amount of memory installed on a caching module.

Corrective action None.

iomem.memory.zero.complete
Message

iomem.memory.zero.complete

Severity

INFO

Description

This message occurs when the boot-time zeroing of the memory of a caching
module is complete.

Corrective action None.

iomem.memory.zero.start
Message

iomem.memory.zero.start

Severity

INFO

Description

This message occurs when the boot-time zeroing of the memory of a caching
module is starting.

Corrective action None.

iomem.nor.op.failed
Message

iomem.nor.op.failed

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that an operation
to a NOR flash memory has failed.

Corrective action None.

iomem.pci.error.config.bar
Message

iomem.pci.error.config.bar

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects a misconfigured
Base Address Register (BAR) on the caching hardware.

Corrective
action

Boot into diagnostics and use the applicable menu option to reprogram the
primary field-programmable gate array (FPGA) image on the caching module.
If the problem persists, replace the caching module.

iomem.pio.op.failed
Message

iomem.pio.op.failed

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that a
programmed I/O (PIO) NAND flash access failed.

Corrective action None.

iomem.remap.block
Message

iomem.remap.block

Severity

INFO

Description

This message occurs when a bad erase block is being remapped to a spare
block.

Corrective action None.

iomem.remap.target.bad
Message

iomem.remap.target.bad

Severity

INFO

Description

This message occurs when the target of a remap is found to be bad.

Corrective action

None.

iomem.temp.report
Message

iomem.temp.report

Severity

INFO

Description

This message occurs periodically to report the operating temperature of the
field-programmable gate array (FPGA) on the caching module.

Corrective action None.

iomem.train.complete
Message

iomem.train.complete

Severity

INFO

Description

This message occurs when the caching module driver has successfully trained
one of the memory controllers for a memory DIMM bank to report the
calibrated idelay setting.

Corrective action None.

iomem.train.fail
Message

iomem.train.fail

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that the card
memory controllers have failed to train for the installed DIMMs.

Corrective action Replace the caching module.

iomem.train.notReady
Message

iomem.train.notReady

Severity

NODE_ERROR

Description

This message occurs when the caching module driver detects that a caching
module memory controller has failed to become ready for operation after
calibration.

Corrective action Replace the caching module.

iomem.train.start
Message

iomem.train.start

Severity

INFO

Description

This message occurs when the caching module driver initiates training of the
memory controllers on the acceleration card to calibrate them to the installed
memory modules.

Corrective action None.

iomem.vmargin.high
Message

iomem.vmargin.high

Severity

WARNING

Description

This message occurs when the acceleration card driver has been configured to
margin a voltage level high for testing purposes.

Corrective action None.

iomem.vmargin.low
Message

iomem.vmargin.low

Severity

WARNING

Description

This message occurs when the caching module driver has been configured to
margin a voltage level low for testing purposes.

Corrective action None.

iomem.vmargin.nominal
Message

iomem.vmargin.nominal

Severity

INFO

Description

This message occurs when voltage margining has been returned to nominal
level on the caching module.

Corrective action None.

monitor.extCache.failed
Message

monitor.extCache.failed

Severity

LOG_WARNING

Description

This message occurs if the monitor detects the Write Anywhere File Layout
(WAFL) external cache subsystem (FlexScale) has failed and is no longer
available for use.

Corrective action Consult the system logs to determine the original cause of the error.

monitor.flexscale.noLicense
Message

monitor.flexscale.noLicense

Severity

INFO

Description

This message occurs if the monitor detects that the caching module is present
but the FlexScale product is not licensed. FlexScale requires a license for use.

Corrective action Obtain a license for the FlexScale product, or remove the caching module.

SAS EMS messages


SAS EMS messages inform you of events and problems involving your system's SAS disk drives.

ds.sas.config.warning
Message

ds.sas.config.warning

Severity

WARNING

Description

This message occurs when the system detects a configuration problem on the
shelf I/O module.

Corrective action 1. Reseat the disk shelf I/O module.
2. If that does not fix the problem, replace the disk shelf I/O module.

SNMP trap ID

N/A

ds.sas.crc.err
Message

ds.sas.crc.err

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) cyclic redundancy
check (CRC) error is detected.

Corrective action N/A


SNMP trap ID

N/A

ds.sas.drivephy.disableErr
Message

ds.sas.drivephy.disableErr

Severity

ERR

Description

This message occurs when a physical layer device (PHY) on a serial-attached
SCSI (SAS) I/O module is disabled because of one of the following reasons:

Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Exceeded invalid double word threshold
Exceeded PHY reset problem threshold
Exceeded broadcast change threshold
Mirroring disabled on the other I/O module

Corrective action Replace the disabled disk drive.


SNMP Trap ID

#574

ds.sas.element.fault
Message

ds.sas.element.fault

Severity

ERR

Description

This message indicates a transport error.

Corrective action 1. Check cabling to the disk shelf.
2. Check the status LED on the disk shelf and make sure that fault LEDs are
not on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the disk shelf for information about the
meanings of the LEDs.

SNMP trap ID

N/A

ds.sas.element.xport.error
Message

ds.sas.element.xport.error

Severity

ERR

Description

This message indicates a transport error.

Corrective action 1. Check cabling to the disk shelf.
2. Check the status LED on the disk shelf and make sure that fault LEDs are
not on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the disk shelf for information about the
meanings of the LEDs.

SNMP trap ID

N/A

ds.sas.hostphy.disableErr
Message

ds.sas.hostphy.disableErr

Severity

ERR

Description

This message occurs when a host physical layer device (PHY) on a
serial-attached SCSI (SAS) I/O module is disabled because of one of the
following reasons:

Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Exceeded invalid double word threshold
Exceeded PHY reset problem threshold
Exceeded broadcast change threshold
Mirroring disabled on the other I/O module

Corrective action Replace the disk shelf module to which the host physical layer device belongs.
SNMP trap ID

N/A

ds.sas.invalid.word
Message

ds.sas.invalid.word

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) word error is detected
in a SAS primitive. These errors can be caused by the disk drive, the cable, the
host bus adapter (HBA), or the shelf I/O module.

Corrective action The SAS specification allows for a certain bit error rate, so these errors can
occur. Occasional individual errors are no cause for alarm.

SNMP trap ID

N/A
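
The guide does not define how many errors still count as "occasional". The following is a minimal sketch of one way to flag a burst of these DEBUG events with a sliding time window; the window length and event limit are placeholder assumptions that you would choose for your own site, not values from the SAS specification or from Data ONTAP.

```python
from collections import deque
from typing import Deque, Iterable


def exceeds_occasional_rate(
    timestamps: Iterable[float], window_seconds: float, max_events: int
) -> bool:
    """Return True if more than max_events fall within any window_seconds span.

    Both limits are site-chosen assumptions, not vendor thresholds.
    """
    recent: Deque[float] = deque()
    for stamp in sorted(timestamps):
        recent.append(stamp)
        # Drop events that are older than the window ending at this event.
        while recent[0] < stamp - window_seconds:
            recent.popleft()
        if len(recent) > max_events:
            return True
    return False


# Event times in seconds; three events spread over hours versus four in 30 seconds.
sparse = [0.0, 4000.0, 9000.0]
burst = [0.0, 10.0, 20.0, 30.0]
print(exceeds_occasional_rate(sparse, window_seconds=3600.0, max_events=3))
print(exceeds_occasional_rate(burst, window_seconds=3600.0, max_events=3))
```

A burst that trips such a check is a reason to inspect the disk drive, cable, HBA, and shelf I/O module named in the description, not proof that any one of them has failed.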

ds.sas.loss.dword
Message

ds.sas.loss.dword

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) loss of double word
synchronization error is detected in a SAS primitive.

Corrective action N/A


SNMP trap ID

N/A

ds.sas.multPhys.disableErr
Message

ds.sas.multPhys.disableErr

Severity

ERR

Description

This message occurs when physical layer devices (PHYs) are disabled on
multiple disk drives in a serial-attached SCSI (SAS) disk shelf.

Corrective action 1. Check whether the problems on the physical layer devices are valid.
2. If multiple physical layer devices are disabled at the same time, replace the
disk shelf module.

SNMP trap ID

N/A

ds.sas.phyRstProb
Message

ds.sas.phyRstProb

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) physical layer device
(PHY) reset error is detected in a SAS primitive.

Corrective action N/A


SNMP trap ID

N/A

ds.sas.running.disparity
Message

ds.sas.running.disparity

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) running disparity error
is detected in a SAS primitive. These errors occur when the counts of
logical 1s and 0s on the link drift too far out of balance.

Corrective action N/A


SNMP trap ID

N/A
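
To illustrate what "out of balance" means, the following sketch checks running disparity over a sequence of 10-bit code groups. It assumes 8b/10b-style line coding and applies the rule at the level of whole code groups only, which is a simplification (real decoders also apply the rule to the 6-bit and 4-bit sub-blocks). It is not the logic used by the shelf or adapter firmware.

```python
from typing import Optional, Sequence


def first_disparity_violation(
    code_groups: Sequence[str], start_rd: int = -1
) -> Optional[int]:
    """Return the index of the first code group that breaks running disparity.

    Each code group is a string of ten '0'/'1' characters. A group may carry
    two more 1s than 0s only when the running disparity (RD) is negative, and
    two more 0s than 1s only when RD is positive. Returns None if no group
    violates the rule.
    """
    rd = start_rd
    for index, group in enumerate(code_groups):
        if len(group) != 10 or set(group) - {"0", "1"}:
            raise ValueError(f"not a 10-bit code group: {group!r}")
        disparity = group.count("1") - group.count("0")
        if disparity == 0:
            continue  # Balanced group: RD is unchanged.
        if disparity == 2 and rd == -1:
            rd = 1
        elif disparity == -2 and rd == 1:
            rd = -1
        else:
            return index  # Imbalance in the wrong direction, or too large.
    return None


balanced_stream = ["0011111010", "1100000101", "1010101010"]
broken_stream = ["0011111010", "0011111010"]
print(first_disparity_violation(balanced_stream))
print(first_disparity_violation(broken_stream))
```

In the second stream, two consecutive groups each carry a surplus of 1s, so the second group is reported as the violation.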

ds.sas.ses.disableErr
Message

ds.sas.ses.disableErr

Severity

NODE_ERROR

Description

This message occurs when a virtual SCSI Enclosure Services (SES) physical
layer device (PHY) on a serial-attached SCSI (SAS) I/O module is disabled due
to one of the following reasons:

Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Exceeded invalid double word threshold
Exceeded PHY reset problem threshold
Exceeded broadcast change threshold

Corrective action Replace the shelf module to which the concerned SES physical layer device
belongs.
SNMP trap ID

N/A

ds.sas.xfer.element.fault
Message

ds.sas.xfer.element.fault

Severity

ERR

Description

This message indicates that an element had a fault during an I/O request. It
might be because of a transient condition in link connectivity.

Corrective action 1. Check cabling to the shelf.
2. Check the status LED on the shelf, and make sure that fault LEDs are not
on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the shelf for information about the
meanings of the LEDs.

SNMP trap ID

N/A

ds.sas.xfer.export.error
Message

ds.sas.xfer.export.error

Severity

ERR

Description

This message indicates a transport error during an I/O request. It might be due
to a transient condition in link activity.

Corrective action 1. Check cabling to the shelf.
2. Check the status LED on the shelf, and make sure that fault LEDs are not
on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the shelf for information about the
meanings of the LEDs.

SNMP trap ID

N/A

ds.sas.xfer.not.sent
Message

ds.sas.xfer.not.sent

Severity

ERR

Description

This message indicates that an I/O transfer could not be sent. It might be
because of a transient condition in link connectivity.

Corrective action 1. Check cabling to the shelf.
2. Check the status LED on the shelf, and make sure that fault LEDs are not
on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the shelf for information about the
meanings of the LEDs.

SNMP trap ID

N/A

ds.sas.xfer.unknown.error
Message

ds.sas.xfer.unknown.error

Severity

ERR

Description

This message indicates that an unknown error occurred during an I/O request.

Corrective action N/A


SNMP trap ID

N/A

sas.adapter.bad
Message

sas.adapter.bad

Severity

ALERT

Description

This message occurs when the serial-attached SCSI (SAS) adapter fails to
initialize.

Corrective action 1. Reseat the adapter.
2. If reseating the adapter failed to help, replace the adapter.

SNMP trap ID

N/A

sas.adapter.bootarg.option
Message

sas.adapter.bootarg.option

Severity

INFO

Description

The serial-attached SCSI (SAS) adapter driver is setting an option based on the
setting of a bootarg/environment variable.

Corrective action None.

SNMP trap ID

N/A

sas.adapter.debug
Message

sas.adapter.debug

Severity

INFO

Description

This message occurs during a serial-attached SCSI (SAS) adapter driver
debug event.

Corrective action None


SNMP trap ID

N/A

sas.adapter.exception
Message

sas.adapter.exception

Severity

WARNING

Description

This message occurs when the serial-attached SCSI (SAS) adapter driver
encounters an error with the adapter. The adapter is reset to recover.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.failed
Message

sas.adapter.failed

Severity

ERR

Description

This message occurs when the serial-attached SCSI (SAS) adapter driver
cannot recover the adapter after resetting it multiple times. The adapter is put
offline.

Corrective action 1. If the adapter is in use, check the cabling.
2. If connected to disk shelves, check the seating of IOM cards and disks.
3. If the problem persists, try replacing the adapter.
4. If the issue is still not resolved, contact technical support.

SNMP trap ID

N/A

sas.adapter.firmware.download
Message

sas.adapter.firmware.download

Severity

INFO

Description

This message occurs when firmware is being updated on the serial-attached
SCSI (SAS) adapter.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.firmware.fault
Message

sas.adapter.firmware.fault

Severity

WARNING

Description

This message occurs when a firmware fault is detected on the serial-attached
SCSI (SAS) adapter and it is being reset to recover.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.firmware.update.failed
Message

sas.adapter.firmware.update.failed

Severity

CRIT

Description

This message occurs when firmware on the serial-attached SCSI (SAS) adapter
cannot be updated.

Corrective action Replace the adapter as soon as possible. The SAS adapter driver attempts to
continue using the adapter without updating the firmware image.
SNMP trap ID

N/A

sas.adapter.not.ready
Message

sas.adapter.not.ready

Severity

ERR

Description

This message occurs when the serial-attached SCSI (SAS) adapter does not
become ready after being reset.

Corrective action The SAS adapter driver automatically attempts to recover from this error. If the
error keeps occurring, the adapter might need to be replaced.
SNMP trap ID

N/A

sas.adapter.offline
Message

sas.adapter.offline

Severity

INFO

Description

This message indicates the name of the associated serial-attached SCSI (SAS)
host bus adapter (HBA).

Corrective action None.


SNMP trap ID

N/A

sas.adapter.offlining
Message

sas.adapter.offlining

Severity

INFO

Description

This message occurs when the serial-attached SCSI (SAS) adapter is going
offline after all outstanding I/O requests have finished.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.online
Message

sas.adapter.online

Severity

INFO

Description

This message indicates that the serial-attached SCSI (SAS) adapter is now
online.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.online.failed
Message

sas.adapter.online.failed

Severity

LOG_ERR

Description

This message indicates the name of the associated serial-attached SCSI (SAS)
host bus adapter (HBA).

Corrective action 1. If the HBA is in use, check the cabling.
2. If the HBA is connected to disk shelves, check the seating of IOM cards.

SNMP trap ID

N/A

sas.adapter.onlining
Message

sas.adapter.onlining

Severity

INFO

Description

This message indicates that the serial-attached SCSI (SAS) adapter is in the
process of going online.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.reset
Message

sas.adapter.reset

Severity

INFO

Description

This message occurs when the Data ONTAP serial-attached SCSI (SAS) driver
is resetting the specified HBA. This can occur during normal error handling or
by user request.

Corrective action None.


SNMP trap ID

N/A

sas.adapter.unexpected.status
Message

sas.adapter.unexpected.status

Severity

WARNING

Description

This message occurs when the serial-attached SCSI (SAS) adapter returns an
unexpected status and is reset to recover.

Corrective action None.


SNMP trap ID

N/A

sas.cable.error
Message

sas.cable.error

Severity

WARNING

Description

This message occurs when information about the cable attached to the
serial-attached SCSI (SAS) adapter port cannot be retrieved.

Corrective action None.

SNMP trap ID

N/A

sas.cable.pulled
Message

sas.cable.pulled

Severity

INFO

Description

The cable attached to the serial-attached SCSI (SAS) adapter port was pulled
out.

Corrective action None.


SNMP trap ID

N/A

sas.cable.pushed
Message

sas.cable.pushed

Severity

INFO

Description

The cable attached to the serial-attached SCSI (SAS) adapter port was pushed
in.

Corrective action None.


SNMP trap ID

N/A

sas.config.mixed.detected
Message

sas.config.mixed.detected

Severity

WARNING

Description

This message occurs when a serial-attached SCSI (SAS) disk shelf contains a
mixture of SAS drives, serial advanced technology attachment (SATA) drives,
or bridged SAS drives. Mixing drive types within a disk shelf is not supported.

Corrective action Ensure that each SAS disk shelf is populated with drives of only one type.
SNMP trap ID

N/A
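
The rule above (one drive type per disk shelf) can be checked against an inventory before drives are installed. The following is a minimal sketch, assuming you maintain your own mapping of shelf names to drive type labels; the shelf names and the labels `"SAS"` and `"SATA"` are illustrative and are not read from the storage system.

```python
from typing import Dict, List, Set


def shelves_with_mixed_drives(
    drive_types_by_shelf: Dict[str, List[str]]
) -> Dict[str, Set[str]]:
    """Return the shelves that hold more than one drive type, with the types found."""
    mixed: Dict[str, Set[str]] = {}
    for shelf, drive_types in drive_types_by_shelf.items():
        distinct = set(drive_types)
        if len(distinct) > 1:
            mixed[shelf] = distinct
    return mixed


# Hypothetical inventory keyed by shelf name.
inventory = {
    "shelf-10": ["SAS", "SAS", "SAS"],
    "shelf-11": ["SAS", "SATA", "SAS"],
}

for shelf, types in shelves_with_mixed_drives(inventory).items():
    print(shelf, sorted(types))
```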

sas.device.invalid.wwn
Message

sas.device.invalid.wwn

Severity

ERR

Description

This message occurs when the serial-attached SCSI (SAS) device responds with
an invalid worldwide name.

Corrective action Power-cycling the device might allow it to recover from this problem.
SNMP trap ID

N/A

sas.device.quiesce
Message

sas.device.quiesce

Severity

INFO

Description

This message indicates that at least one command to the specified device has not
completed in the normally expected time. In this case, the driver stops sending
additional commands to the device until all outstanding commands have had an
opportunity to be completed. This condition is automatically handled by the
Data ONTAP serial-attached SCSI (SAS) driver.

Corrective
action

This condition by itself does not mean that the target device is problematic. High
workloads might cause link saturation leading to device contention for the bus.
Transport issues might also cause link throughput to decrease, thereby causing
I/Os to take longer than normal.
If you see this message only on occasion, no action is required. The system
handles the condition automatically.

SNMP trap ID

N/A

sas.device.resetting
Message

sas.device.resetting

Severity

WARNING

Description

This message indicates device level error recovery has escalated to resetting the
device. It is usually seen in association with error conditions such as device
level timeouts or transmission errors.
This message reports the recovery action taken by the Data ONTAP
serial-attached SCSI (SAS) driver when evaluating associated device-related or
link-related error conditions.

Corrective action None.


SNMP trap ID

N/A

sas.device.timeout
Message

sas.device.timeout

Severity

ERR

Description

This message occurs when not all outstanding commands to the specified device
were completed within the allotted time. As part of the standard error handling
sequence managed by the Data ONTAP serial-attached SCSI (SAS) driver, all
commands to the device are aborted and reissued.

Corrective
action

Device level timeouts are a common indication of a SAS link stability problem.
In some cases, the link is operating normally and the specified device is having
trouble processing I/O requests in a timely manner. In such cases, the specified
device should be evaluated for possible replacement.
Quite often the problem results from the partial failure of a component involved
in the SAS transport. Common things to check include the following:

Complete seating of drive carriers in enclosure bays
Properly secured cable connections
IOM seating
Crimped or otherwise damaged cables

SNMP trap ID

N/A

sas.initialization.failed
Message

sas.initialization.failed

Severity

ERR

Description

This message occurs when the serial-attached SCSI (SAS) adapter fails to
initialize the link and appears to be unattached or disconnected.

Corrective action 1. If the adapter is in use, check the cabling.
2. If the adapter is connected to disk shelves, check the seating of IOM cards.

SNMP trap ID

N/A

sas.link.error
Message

sas.link.error

Severity

ERR

Description

This message occurs when the serial-attached SCSI (SAS) adapter cannot
recover the link and is going offline.

Corrective action 1. If the adapter is in use, check the cabling.
2. If the adapter is connected to disk shelves, check the seating of IOM cards
and disks.
3. If this does not resolve the issue, contact technical support.
SNMP trap ID

N/A

sas.port.disabled
Message

sas.port.disabled

Severity

WARNING

Description

The serial-attached SCSI (SAS) adapter port went down by virtue of being
disabled by the operator.

Corrective action None.


SNMP trap ID

N/A

sas.port.down
Message

sas.port.down

Severity

WARNING

Description

The serial-attached SCSI (SAS) adapter port went down through no action by
the operator.

Corrective action None.


SNMP trap ID

N/A

sas.shelf.conflict
Message

sas.shelf.conflict

Severity

ERR

Description

This message occurs when the system detects that two or more serial-attached
SCSI (SAS) disk shelves have the same shelf ID. The SAS domain is
functional, but references to disk shelves will be based on disk shelf serial
numbers, not disk shelf IDs.

Corrective action Reassign disk shelf IDs so that no conflict exists.


SNMP trap ID

N/A
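
Before reassigning IDs, it can help to list which shelves share an ID. The following is a minimal sketch, assuming you have collected (serial number, shelf ID) pairs by hand or from your own inventory; the serial numbers shown are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_shelf_id_conflicts(
    shelves: Iterable[Tuple[str, int]]
) -> Dict[int, List[str]]:
    """Map each duplicated shelf ID to the serial numbers of the shelves using it."""
    serials_by_id: Dict[int, List[str]] = defaultdict(list)
    for serial, shelf_id in shelves:
        serials_by_id[shelf_id].append(serial)
    return {
        shelf_id: serials
        for shelf_id, serials in serials_by_id.items()
        if len(serials) > 1
    }


# Hypothetical (serial number, shelf ID) pairs.
domain = [("SHJ0001", 10), ("SHJ0002", 11), ("SHJ0003", 10)]
print(find_shelf_id_conflicts(domain))
```

Keying the result by shelf ID and listing serial numbers mirrors how the system itself refers to the shelves while the conflict exists.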

sasmon.adapter.phy.disable
Message

sasmon.adapter.phy.disable

Severity

ERR

Description

This message occurs when a serial-attached SCSI (SAS) transceiver (physical
layer device) attached to a SAS host bus adapter (HBA) is disabled for one of
the following reasons:

Exceeded loss of double word synchronization error threshold
Exceeded running disparity error threshold
Exceeded invalid double word error threshold
Exceeded physical layer device reset problem threshold
Exceeded broadcast change threshold

Corrective
action

1. If the adapter is in use, check the cabling.
2. If the adapter is connected to the disk shelves, check the seating of the IOM
cards.
3. If that does not fix the problem, contact technical support.

SNMP trap ID

N/A

sasmon.adapter.phy.event
Message

sasmon.adapter.phy.event

Severity

DEBUG

Description

This message occurs when a serial-attached SCSI (SAS) transceiver (physical
layer device) attached to a SAS host bus adapter (HBA) experiences a transient
error. These errors are observed on a received double word (dword) or when
resetting a PHY.
These errors include disparity errors, invalid dword errors, physical layer
device (PHY) reset problem errors, loss of dword synchronization errors, and
PHY change events. The SAS specification allows for a certain bit error rate, so
these errors can occur under normal operating conditions.
There is no cause for concern if these individual errors show up occasionally.

Corrective
action

None.

SNMP trap ID

N/A

sasmon.disable.module
Message

sasmon.disable.module

Severity

INFO

Description

This message occurs when the Data ONTAP module responsible for monitoring
transient errors in the serial-attached SCSI (SAS) domains is disabled
because the environment variable disable-sasmon? is set to true.

Corrective action Set the environment variable disable-sasmon? to false to enable this
monitor module.
SNMP trap ID

N/A

shm.threshold.spareBlocksConsumed
Message

shm.threshold.spareBlocksConsumed

Severity

NOTICE

Description

This message occurs when the spares consumed value exceeds the first
threshold on an SSD.

Corrective action None.

shm.threshold.spareBlocksConsumedMax
Message

shm.threshold.spareBlocksConsumedMax

Severity

WARNING

Description

This message occurs when the spares consumed value exceeds the second
threshold on an SSD.

Corrective action None.
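
The two shm.threshold messages describe a two-level check on one value. The following sketch shows that pattern, returning the message name and severity listed in this guide. The guide does not state the threshold values or the unit of the spares consumed value, so the numbers below are placeholders only.

```python
from typing import Optional, Tuple


def classify_spares_consumed(
    spares_consumed: float, first_threshold: float, second_threshold: float
) -> Optional[Tuple[str, str]]:
    """Return (message name, severity) for a spares consumed value, or None.

    The threshold arguments are assumptions supplied by the caller.
    """
    if first_threshold >= second_threshold:
        raise ValueError("first_threshold must be below second_threshold")
    if spares_consumed > second_threshold:
        return ("shm.threshold.spareBlocksConsumedMax", "WARNING")
    if spares_consumed > first_threshold:
        return ("shm.threshold.spareBlocksConsumed", "NOTICE")
    return None


# Placeholder thresholds of 10 and 20; real values are not given in this guide.
for value in (5, 15, 25):
    print(value, classify_spares_consumed(value, first_threshold=10, second_threshold=20))
```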

SES EMS messages


SES messages appear in AutoSupport messages if failures or warning conditions occur in your
system's storage components.

ses.access.noEnclServ
Message

ses.access.noEnclServ

Severity

NODE_ERROR

Description

This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the enclosure monitoring process in any disk shelf on
the channel. Some disk shelves require that disks be installed and functioning in
particular shelf bays.

Corrective
action

Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 FC disk shelves are used in this message as an
example.

1. In disk shelves that require certain disk placement, verify that disks are
installed in the indicated bays: DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk
shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced; in SCSI-based shelves, replace the shelf.
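
Step 2 depends on how long the error has persisted. The following is a minimal sketch of that decision, assuming you record when the message was first seen and have already verified disk placement; the function name and its parameters are illustrative, and the one-hour limit is taken from step 2.

```python
ONE_HOUR_SECONDS = 3600.0


def should_power_cycle_shelf(
    first_seen: float,
    now: float,
    disks_placed_correctly: bool,
    limit_seconds: float = ONE_HOUR_SECONDS,
) -> bool:
    """Return True when step 2 applies: placement is correct and the error
    has persisted for longer than limit_seconds (timestamps are in seconds)."""
    return disks_placed_correctly and (now - first_seen) > limit_seconds


# Error first seen at t=0; checked again after 30 minutes and after 2 hours.
print(should_power_cycle_shelf(0.0, 1800.0, disks_placed_correctly=True))
print(should_power_cycle_shelf(0.0, 7200.0, disks_placed_correctly=True))
```

Waiting out the hour gives SES in the storage system time to reestablish contact with the disk shelf on its own, as the note in step 1 describes.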

ses.access.noMoreValidPaths
Message

ses.access.noMoreValidPaths

Severity

NODE_ERROR

Description

This message occurs when SCSI Enclosure Services (SES) in the storage system
loses access to the enclosure monitoring process in the disk shelf. Some disk
shelves require that disks be installed and functioning in particular shelf bays.

Corrective
action

Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.

1. This message occurs when SES in the storage system loses access to the
enclosure monitoring process in the disk shelf.
Some disk shelves require that disks be installed and functioning in particular
shelf bays: DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the
disk shelf.

2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced.
In SCSI-based shelves, replace the shelf.

ses.access.noShelfSES
Message

ses.access.noShelfSES

Severity

NODE_ERROR

Description

This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the SES process in the indicated disk shelf. Some
disk shelves require that disks be installed and functioning in particular disk shelf
bays.

Corrective
action

Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.

1. In disk shelves that require certain disk placement, verify that disks are
installed in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the
disk shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced.
In SCSI-based shelves, replace the shelf.

ses.access.sesUnavailable
Message

ses.access.sesUnavailable

Severity

NODE_ERROR

Description

This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the enclosure monitoring process in one or more disk
shelves on the channel. Some disk shelves require that disks be installed and
functioning in particular disk shelf bays.

Corrective
action

Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.

1. In disk shelves that require certain disk placements, verify that disks are
installed in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk
shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced. In SCSI-based shelves, replace the shelf.
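
The decision order in the corrective action above can be sketched as a small helper. This is only an illustration: the function name, its arguments, and the returned strings are hypothetical, and the one-hour window is the only value taken from the procedure.

```python
from datetime import datetime, timedelta

# The procedure says to act only if the error persists for more than an hour.
PERSISTENCE_WINDOW = timedelta(hours=1)


def next_step(first_seen, now, disks_in_required_bays):
    """Suggest the next corrective step (hypothetical helper, not an ONTAP API)."""
    if not disks_in_required_bays:
        # Step 1: some shelves need disks in particular bays for SES.
        return "verify disk placement"
    if now - first_seen <= PERSISTENCE_WINDOW:
        # SES retries contact periodically, so give it time first.
        return "wait for SES to retry"
    # Step 2: placement is correct and the error has lasted over an hour.
    return "halt system, power-cycle shelf, reboot"
```

If the error remains after the power-cycle, step 3 (hardware replacement) still applies; the sketch stops at step 2 because that decision needs hands-on inspection.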

ses.badShareStorageConfigErr
Message

ses.badShareStorageConfigErr

Severity

NODE_ERROR

Description

This message occurs when a disk shelf module that is not supported in a
SharedStorage system, such as an LRC module, is detected in a SharedStorage
system.

Corrective action Replace the unsupported module with one that is supported, such as an ESH,
ESH2, or AT-FCX module.

ses.bridge.fw.getFailWarn
Message

ses.bridge.fw.getFailWarn

Severity

WARNING

Description

This message occurs when the bridge firmware revision cannot be obtained.

Corrective action Check the connection to the bank of Maxtor drives.

ses.bridge.fw.mmErr
Message

ses.bridge.fw.mmErr

Severity

SVC_ERROR

Description

This message occurs when the bridge firmware revision is inconsistent.

Corrective action Check the firmware revision numbers and make sure that they are consistent.
You might have to update the firmware.

ses.channel.rescanInitiated
Message

ses.channel.rescanInitiated

Severity

INFO

Description

This message identifies the name of the adapter port or switch port being
rescanned; for example, 7a or myswitch:5.

Corrective action None.

ses.config.drivePopError
Message

ses.config.drivePopError

Severity

WARNING

Description

This message occurs when the channel has more disk drives on it than are
allowed.
Systems using synchronous mirroring allow more disk drives per channel than
other systems.

Corrective
action

Your action depends on whether you intend to use synchronous mirroring.

If you intend to use synchronous mirroring, make sure that the license is
installed.
If you do not intend to use synchronous mirroring, reduce the number of disk
drives on the channel to no more than the maximum allowed.

ses.config.IllegalEsh270
Message

ses.config.IllegalEsh270

Severity

NODE_ERROR

Description

This message occurs when Data ONTAP detects one or more ESH disk shelf
modules in a disk shelf that is attached to a FAS270 system. This is not a
supported configuration.

Corrective action Replace the ESH modules with ESH2 modules.

ses.config.shelfMixError
Message

ses.config.shelfMixError

Severity

NODE_ERROR

Description

This message occurs when the channel has a mixture of ATA and Fibre Channel
disk shelves; this is not a supported configuration.

Corrective
action

Mixed-mode operation of ATA and Fibre Channel disks on the system is only
supported on separate loops. Move all Fibre Channel-based disk shelves to one
loop and place all Fibre Channel-to-ATA-based disk shelves on another loop.
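
When planning the recabling, it can help to list which loops currently mix shelf types. The sketch below assumes you have written down, by hand or from your own inventory, the shelf type on each loop; the function name, the loop names, and the "FC" and "ATA" labels are illustrative and are not produced by Data ONTAP.

```python
def mixed_loops(shelves_by_loop):
    """Return the loops that carry both Fibre Channel and ATA shelves.

    shelves_by_loop maps a loop name to a list of shelf type labels,
    for example {"0a": ["FC", "ATA"]}. The labels are hypothetical.
    """
    return sorted(
        loop
        for loop, types in shelves_by_loop.items()
        if {"FC", "ATA"} <= set(types)
    )
```

For example, `mixed_loops({"0a": ["FC", "ATA"], "0b": ["FC", "FC"]})` reports only loop `0a` as needing attention.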

ses.config.shelfPopError
Message

ses.config.shelfPopError

Severity

NODE_ERROR

Description

This message occurs when the channel has more shelves on it than are allowed.

Corrective action Reduce the number of disk shelves on the channel to the number specified.

ses.disk.configOk
Message

ses.disk.configOk

Severity

INFO

Description

This message occurs when there are no longer any drives in FAS2050 or SA200
system slots between 20 and 23.

Corrective action None.

ses.disk.illegalConfigWarn
Message

ses.disk.illegalConfigWarn

Severity

WARNING

Description

This message occurs when disk drives are inserted into the bottom row of a
FAS2050 or an SA200 system. Disk drives are not supported in those slots.

Corrective action None.
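
A minimal sketch of the slot check behind this warning follows. It assumes that the unsupported bottom row corresponds to slots 20 through 23 inclusive, which is inferred from the ses.disk.configOk description rather than stated for this message; the function name is hypothetical.

```python
# Assumption: the bottom row of a FAS2050 or SA200 is slots 20-23 inclusive.
UNSUPPORTED_SLOTS = range(20, 24)


def illegal_slots(occupied_slots):
    """Return the occupied slot numbers that fall in the unsupported row."""
    return sorted(s for s in occupied_slots if s in UNSUPPORTED_SLOTS)
```

An empty result corresponds to the condition reported by ses.disk.configOk.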

ses.disk.pctl.timeout
Message

ses.disk.pctl.timeout

Severity

DEBUG

Description

This message occurs when a power control request submitted to the specified
SCSI Enclosure Services (SES) module is not completed within 60 seconds.

Corrective
action

Normally, there is no corrective action required for this error because the
timeout might be due to a transient error. However, if you see this message
frequently, there might be an issue with the I/O module in the shelf, which
might need to be replaced.
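
The guide does not define "frequently". As one possible way to decide, the sketch below flags a module when a chosen number of timeouts fall within a chosen time span. Both the default threshold (3) and the default window (24 hours) are assumptions made for illustration, as is the function name.

```python
from datetime import datetime, timedelta


def is_frequent(timestamps, window=timedelta(hours=24), threshold=3):
    """Return True if at least `threshold` timeouts fall within any `window` span.

    The default window and threshold are illustrative, not NetApp guidance.
    """
    ordered = sorted(timestamps)
    # Slide over each run of `threshold` consecutive events and compare
    # the time between the first and last event of the run.
    for i in range(len(ordered) - threshold + 1):
        if ordered[i + threshold - 1] - ordered[i] <= window:
            return True
    return False
```

Isolated timeouts spread over several days return False, which matches the guidance that a transient error needs no action.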

ses.download.powerCyclingChannel
Message

ses.download.powerCyclingChannel

Severity

INFO

Description

This message occurs when the power-cycling channel event is issued after a
disk shelf firmware download to disk shelves that require a power-cycle to
activate the new code.

Corrective action None.

ses.download.shelfToReboot
Message

ses.download.shelfToReboot

Severity

INFO

Description

This message occurs after the completion of shelf firmware transfer to the
DS14mk2 AT disk shelf. At this point, the disk shelf requires about another five
minutes to transfer the new firmware to its nonvolatile program memory,
whereupon it reboots to begin to execute the new firmware. During this reboot,
an FC loop reinitialization occurs, temporarily interrupting the loop.

Corrective
action

None.

ses.download.suspendIOForPowerCycle
Message

ses.download.suspendIOForPowerCycle

Severity

INFO

Description

This message occurs when the suspending I/O event signals that the storage
subsystem is temporarily stopping I/O to disks while one or more disk shelves
have their power cycled after a download, if required by the disk shelf design.

Corrective
action

None.

ses.drive.PossShelfAddr
Message

ses.drive.PossShelfAddr

Severity

WARNING

Description

This message occurs in conjunction with the message ses.drive.shelfAddr.mm
when there are devices that have apparently taken a wrong address; the adapter
shows device addresses that SCSI Enclosure Services (SES) indicates should not
exist, and vice versa.
This error is not a fatal condition. It means that SES cannot perform certain
operations on the affected disk drives, such as setting failure LEDs, because it is not
certain which disk shelf the affected disk drive is in.

Corrective
action

1. If the problem is throughout the disk shelf, replace the disk shelf.
2. If the error affects only one disk drive per disk shelf, the drive might have taken an
incorrect address at power-on.
3. Arrange to make this disk drive a spare, and then reseat it to cause it to take its
address again.
4. If the problem persists, insert a different spare disk drive into the slot. If the
error then clears, replace the original disk drive.
5. If the problem persists, there is a hardware problem with the individual disk
bay. Replace the disk shelf.

ses.drive.shelfAddr.mm
Message

ses.drive.shelfAddr.mm

Severity

NODE_ERROR

Description

This message occurs when there is a mismatch between the position of the drives
detected by the disk shelf and the address of the drives detected by the FC loop or
SCSI bus.
This error indicates that a disk drive took an address other than what the disk shelf
should have provided, or that SCSI Enclosure Services (SES) in a disk shelf cannot
be contacted for address information, or that a disk drive unexpectedly does not
participate in device discovery on the loop or bus.
If the message ses.drive.PossShelfAddr subsequently appears, follow
the corrective actions in that message.
In this condition, the SES process in the system might be unable to perform certain
operations on the disk, such as setting failure LEDs or detecting disk swaps.

Corrective
action

Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.

1. If this occurs to multiple disk drives on the same loop, check the I/O modules at
the back of the disk shelves on that loop for errors.
2. In disk shelves that require certain disk placement, verify that disks are installed
in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based disk shelves and DS14mk2 AT disk shelves do not rely on
disk placement for SES.

ses.exceptionShelfLog
Message

ses.exceptionShelfLog

Severity

DEBUG

Description

This message occurs when an I/O module encounters an exception condition.

Corrective
action

1. Check the system logs to see whether any disk errors recently occurred.
2. Pull an AutoSupport message file that contains the latest copy of the shelf
log information from each disk shelf.
3. Try to correlate the date and time from the errors in the message file with the
date and time of events in the shelf log file.
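
Step 3 can be done by eye, but for long logs a short script helps. The following is a minimal sketch that assumes you have already extracted the error times and the shelf log event times as `datetime` values; the function name and the five-minute tolerance are assumptions, and the guide does not prescribe a matching interval.

```python
from datetime import datetime, timedelta


def correlate(error_times, shelf_events, tolerance=timedelta(minutes=5)):
    """Pair each error time with the shelf log events that occurred close to it.

    Returns a list of (error_time, [nearby_event_times]) tuples. The default
    tolerance is illustrative only.
    """
    matches = []
    for err in error_times:
        nearby = [ev for ev in shelf_events if abs(ev - err) <= tolerance]
        matches.append((err, nearby))
    return matches
```

Errors with an empty list of nearby events are less likely to be related to what the shelf log recorded.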

ses.extendedShelfLog
Message

ses.extendedShelfLog

Severity

DEBUG

Description

This message occurs when a disk encounters an error and the system requests that
additional log information be obtained from both modules in the disk shelf
reporting the error to aid in debugging problems.

Corrective
action

1. Check the system logs to see whether any disk errors recently occurred.
2. Pull an AutoSupport message file that contains the latest copy of the shelf log
information from each disk shelf.
3. Try to correlate the date and time from the errors in the message file with the
date and time of events in the shelf log file.

ses.fw.emptyFile
Message

ses.fw.emptyFile

Severity

WARNING

Description

This message occurs when a firmware file is found to be empty during a disk
shelf firmware update.

Corrective action Obtain the correct firmware file and place it in the etc/shelf_fw directory.
You can download the firmware file from the NetApp Support Site at
support.netapp.com/NOW/download/tools/diskshelf/.
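
Before copying firmware files to the storage system, you can check a local staging directory for zero-length files. This sketch runs on an administration host, not on the storage system; the function name and the staging directory are hypothetical.

```python
from pathlib import Path


def empty_firmware_files(directory):
    """Return the names of zero-length regular files in `directory`."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )
```

Any file name returned should be downloaded again before the firmware update is retried.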

ses.fw.resourceNotAvailable
Message

ses.fw.resourceNotAvailable

Severity

ERR

Description

This message occurs when there is not enough contiguous memory available to
download disk shelf firmware.

Corrective action 1. Reduce system activity before performing a manual disk
shelf firmware update.
2. If the disk shelf firmware update fails again, reboot the storage system.

ses.giveback.restartAfter
Message

ses.giveback.restartAfter

Severity

INFO

Description

This message occurs when SCSI Enclosure Services (SES) is restarted after
giveback.

Corrective action None.

ses.giveback.wait
Message

ses.giveback.wait

Severity

INFO

Description

This message occurs when SCSI Enclosure Services (SES) information is not
available because the system is waiting for giveback.

Corrective action None.

ses.psu.coolingReqError
Message

ses.psu.coolingReqError

Severity

LOG_CRIT

Description

This message occurs when the installed power supplies are placed so that air-flow
requirements of the disk shelf are not met. The power supply chassis and their
power supplies are an integral part of the disk shelf cooling and air-flow design.

Corrective
action

Verify that the power supplies are placed in the locations required to provide
proper air flow according to the disk shelf specifications.
DS14-style shelves always require both power supplies. SAS-Shelf24 requires
power supplies in power supply bays 1 and 4 for proper air flow and cooling.

ses.psu.powerReqError
Message

ses.psu.powerReqError

Severity

LOG_CRIT

Description

This message occurs when too few power supplies are installed to redundantly
satisfy the current-draw requirements of the disk drives in the disk shelf. This
might occur if a power supply is removed or fails. Some disk drive models require
more power than others. If the disk shelf specifications for the installed drive
models specify more power supplies to support that disk type, then this condition
can also occur at disk swap or insertion in some disk shelves.

Corrective
action

Verify that the number of power supplies installed satisfies the power requirements
of the installed disk drives.
DS14-style shelves always require both power supplies. SAS-Shelf24 requires
power supplies in power supply bays 1 and 4 for proper cooling and air flow. If
any disk drives are 10K RPM or faster, then power supply bays 2 and 3 must also
have power supplies.
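
The SAS-Shelf24 rule above can be expressed as a short check. This is a sketch only: the function name and its inputs are hypothetical, and it encodes just the two conditions stated here (bays 1 and 4 always, bays 2 and 3 when any drive is 10K RPM or faster).

```python
def missing_psu_bays(installed_bays, max_drive_rpm):
    """Return the SAS-Shelf24 power supply bays that still need a power supply.

    installed_bays: iterable of bay numbers (1-4) that hold a power supply.
    max_drive_rpm: speed of the fastest installed disk drive.
    """
    required = {1, 4}  # always needed for cooling and air flow
    if max_drive_rpm >= 10000:
        required |= {2, 3}  # needed when any drive is 10K RPM or faster
    return sorted(required - set(installed_bays))
```

For example, a shelf with power supplies in bays 1 and 4 and 15K RPM drives still needs bays 2 and 3 populated.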

ses.remote.configPageError
Message

ses.remote.configPageError

Severity

INFO

Description

This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for a specific disk shelf's SCSI Enclosure
Services (SES) configuration page.

Corrective action Contact technical support.

ses.remote.elemDescPageError
Message

ses.remote.elemDescPageError

Severity

INFO

Description

This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for the element descriptor pages that the
other system has local access to.

Corrective action Contact technical support.

ses.remote.faultLedError
Message

ses.remote.faultLedError

Severity

INFO

Description

This message occurs when a request to another system to have it set the fault
LED of a disk drive on a disk shelf fails.

Corrective action Contact technical support.

ses.remote.flashLedError
Message

ses.remote.flashLedError

Severity

INFO

Description

This message occurs when a request to another system to have it flash the LED
of a disk drive on a disk shelf fails.

Corrective action Contact technical support.

ses.remote.shelfListError
Message

ses.remote.shelfListError

Severity

INFO

Description

This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for a list of the disk shelves that the other
system has local access to.

Corrective action Contact technical support.

ses.remote.statPageError
Message

ses.remote.statPageError

Severity

INFO

Description

This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for the SCSI Enclosure Services (SES)
status pages that the other system has local access to.

Corrective action Contact technical support.

ses.shelf.changedID
Message

ses.shelf.changedID

Severity

WARNING

Description

This message occurs on a SAS disk shelf when the disk shelf ID changes after
power is applied to the disk shelf.

Corrective
action

1. Verify that the disk shelf ID displayed in this message is the same as the disk
shelf ID shown on the disk shelf.
2. If they are different, perform one of the following steps:

If the disk shelf ID displayed in this message is the one you want, reset the
disk shelf ID on the thumbwheel to match it.
If you want the new disk shelf ID instead of the disk shelf ID displayed in
the message, verify that the disk shelf ID you want does not conflict with
other disk shelves in the domain.

3. Power-cycle the disk shelf chassis. You can wait to perform this procedure
until your next maintenance window.
4. If the warning persists on both disk shelf modules after you complete the
procedure, replace the disk shelf chassis. If it persists on only one disk shelf
module, replace the disk shelf module.

ses.shelf.ctrlFailErr
Message

ses.shelf.ctrlFailErr

Severity

SVC_ERROR

Description

This message occurs when the adapter and loop ID of the SCSI Enclosure
Services (SES) target for which the SES has control fail.

Corrective
action

1. Check the LEDs on the disk shelf and the disk shelf modules on the back of
the disk shelf to see whether there are any abnormalities. If the modules
appear to be problematic, replace the applicable module.
2. If the SES target is a disk drive, check to see whether the disk drive failed. If
it failed, replace the disk drive.

ses.shelf.em.ctrlFailErr
Message

ses.shelf.em.ctrlFailErr

Severity

SVC_ERROR

Description

This message occurs when SCSI Enclosure Services (SES) control to the
internal disk drives of a system fails.

Corrective action 1. Enter environment shelf to see whether that disk shelf is still being
actively monitored.
2. If the environment shelf command indicates a failure, there is a
hardware failure in the system's internal disk shelf.

ses.shelf.IdBasedAddr
Message

ses.shelf.IdBasedAddr

Severity

WARNING

Description

This message occurs on a serial-attached SCSI (SAS) disk shelf when the SAS
addresses of the devices are based on the disk shelf ID instead of the disk shelf
backplane serial number. This indicates problems communicating with the disk
shelf backplane.

Corrective
action

1. Reseat the master disk shelf module, as indicated by the output of the
environment shelf command.
2. If the problem persists, reseat the slave disk shelf module.
3. If the problem persists, find the new master disk shelf module and replace it.
4. If the problem persists, replace the other disk shelf module.
5. If the problem persists, replace the disk shelf enclosure.

ses.shelf.invalNum
Message

ses.shelf.invalNum

Severity

WARNING

Description

This message occurs when Data ONTAP detects that a serial-attached SCSI
(SAS) shelf connected to the system has an invalid shelf number.

Corrective action 1. Power-cycle the shelf.
2. If the problem persists, replace the shelf modules.
3. If the problem persists, replace the shelf.

ses.shelf.mmErr
Message

ses.shelf.mmErr

Severity

SVC_FAULT

Description

This message occurs when there is a disk shelf that is not supported by the
platform it was booted on.

Corrective
action

1. Check whether the current version of Data ONTAP supports the disk shelf.
2. If the current version of Data ONTAP does not support the disk shelf, install
a version that does support the disk shelf.
If the disk shelf is supported, the error might be cleared by hourly attempts by
Data ONTAP to establish proper contact with the disk shelf.

ses.shelf.OSmmErr
Message

ses.shelf.OSmmErr

Severity

SVC_ERROR

Description

This message occurs when there are incompatible Data ONTAP versions in a
SharedStorage configuration that would cause SCSI Enclosure Services (SES)
not to function properly.

Corrective action Update the system that has an earlier Data ONTAP version to match the one
that has the latest Data ONTAP version.

ses.shelf.powercycle.done
Message

ses.shelf.powercycle.done

Severity

INFO

Description

This message occurs when a disk shelf power-cycle finishes.

Corrective action

None.

ses.shelf.powercycle.start
Message

ses.shelf.powercycle.start

Severity

INFO

Description

This message occurs when a disk shelf is power-cycled and SCSI Enclosure
Services (SES) needs to wait for it to finish.

Corrective action None.

ses.shelf.sameNumReassign
Message

ses.shelf.sameNumReassign

Severity

WARNING

Description

This message occurs when Data ONTAP detects more than one serial-attached
SCSI (SAS) disk shelf connected to the same adapter with the same shelf
number.

Corrective
action

1. Change the shelf number on the shelf to one that does not conflict with other
shelves attached to the same adapter. Halt the system and reboot the shelf.
2. If the problem persists, contact technical support.
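
To choose a shelf number that does not conflict, it helps to see which numbers are already duplicated on each adapter. The sketch below assumes you have recorded (adapter, shelf number) pairs from your own inventory; the function name and the adapter names in the example are illustrative.

```python
from collections import Counter


def duplicate_shelf_numbers(shelves):
    """Return {adapter: [shelf numbers used more than once on that adapter]}.

    shelves: iterable of (adapter_name, shelf_number) pairs.
    """
    counts = Counter(shelves)
    duplicates = {}
    for (adapter, number), seen in counts.items():
        if seen > 1:
            duplicates.setdefault(adapter, []).append(number)
    return {adapter: sorted(nums) for adapter, nums in duplicates.items()}
```

The same shelf number on two different adapters is not reported, because the message concerns shelves attached to the same adapter.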

ses.shelf.unsupportAllowErr
Message

ses.shelf.unsupportAllowErr

Severity

SVC_FAULT

Description

This message occurs when a disk shelf is not supported by Data ONTAP. Data
ONTAP will continue to use the disk shelf, but environmental monitoring of the
disk shelf is not possible.

Corrective
action

1. Check whether the current version of Data ONTAP supports the disk shelf.
2. If the current version of Data ONTAP does not support the disk shelf, install a
version that does support the disk shelf.
If the disk shelf is supported, the error might be cleared by hourly attempts by
Data ONTAP to establish proper contact with the disk shelf.

ses.shelf.unsupportedErr
Message

ses.shelf.unsupportedErr

Severity

SVC_FAULT

Description

This message occurs when there is a disk shelf that is not supported by Data
ONTAP.

Corrective action Check whether this disk shelf is supported by a newer version of Data ONTAP.
If it is, upgrade to the appropriate version.

ses.startTempOwnership
Message

ses.startTempOwnership

Severity

DEBUG

Description

This message occurs when SCSI Enclosure Services (SES) is starting
temporary ownership acquisition of disks owned by other nodes. This involves
removing the disk reservations while the SES operations are in progress.

Corrective action Contact technical support.

ses.status.ATFCXError
Message

ses.status.ATFCXError

Severity

NODE_ERROR

Description

This message occurs when the reporting disk shelf detects an error in the
indicated AT-FCX module. The module might not be able to perform I/O to
disks within the disk shelf.

Corrective action 1. Verify that the AT-FCX module is fully seated and secured.
2. If the problem persists, replace the AT-FCX module.

ses.status.ATFCXInfo
Message

ses.status.ATFCXInfo

Severity

INFO

Description

This message occurs when a previously reported error in the AT-FCX module
is corrected, or the system reports other information that does not necessarily
require customer action.

Corrective action None.

ses.status.currentError
Message

ses.status.currentError

Severity

NODE_ERROR

Description

This message occurs when a critical condition is detected in the indicated
storage shelf current sensor. The shelf might be able to continue operation.

Corrective action 1. Verify that the power supply and the AC line are supplying power.
2. Monitor the power grid for abnormalities.
3. Replace the power supply.
4. If the problem persists, contact technical support.

ses.status.currentInfo
Message

ses.status.currentInfo

Severity

INFO

Description

This message occurs when an error or warning condition previously reported by
or about the disk shelf current sensor is corrected, or the system reports other
information about the current in the disk shelf that does not necessarily require
customer action.

Corrective action None.

ses.status.currentWarning
Message

ses.status.currentWarning

Severity

WARNING

Description

This message occurs when a warning condition is detected in the indicated
storage shelf current sensor. The shelf might be able to continue operation.

Corrective action 1. Verify that the power supply and the AC line are supplying power.
2. Monitor the power grid for abnormalities.
3. Replace the power supply.
4. If the problem persists, contact technical support.

ses.status.displayError
Message

ses.status.displayError

Severity

NODE_ERROR

Description

This message occurs when the SCSI Enclosure Services (SES) module in the disk
shelf detects an error in the disk shelf display panel. The disk shelf might be
unable to provide correct addresses to its disks.

Corrective
action

1. If possible, verify that the connection between the disk shelf and the display is
secure.
2. Verify that the SES module or modules are fully seated; replacing them might
solve the problem.
3. If the problem persists, the SES module that detected the error condition
might be faulty.
4. If the problem persists after the module or modules are replaced, replace the
disk shelf.
5. If the problem persists, contact technical support.

ses.status.displayInfo
Message

ses.status.displayInfo

Severity

INFO

Description

This message occurs when a previous condition in the display panel is
corrected.

Corrective action None.

ses.status.displayWarning
Message

ses.status.displayWarning

Severity

WARNING

Description

This message occurs when the SCSI Enclosure Services (SES) module detects a
warning condition for the disk shelf display panel. The disk shelf might be unable
to provide correct addresses to its disks.

Corrective
action

1. If possible, verify that the connection between the disk shelf and the display is
secure.
2. Verify that the SES module or modules are fully seated; replacing them might
solve the problem.
3. If the problem persists, the SES module that detected the warning condition
might be faulty.
4. If the problem persists after the module or modules are replaced, replace the
disk shelf.
5. If the problem persists, contact technical support.

ses.status.driveError
Message

ses.status.driveError

Severity

NODE_ERROR

Description

This message occurs when a critical condition is detected for the disk drive in
the shelf. The drive might fail.

Corrective
action

1. Make sure that the drive is not running on a degraded volume. If it is, then
add as many spares as necessary into the system, up to the specified level.
2. After the volume is no longer in degraded mode, replace the drive that is
failing.

ses.status.driveOk
Message

ses.status.driveOk

Severity

INFO

Description

This message occurs when a disk drive that was previously experiencing
a problem returns to normal operation.

Corrective action None.

ses.status.driveWarning
Message

ses.status.driveWarning

Severity

NODE_ERROR

Description

This message occurs when a non-critical condition is detected for the disk drive
in the shelf. The drive might fail.

Corrective
action

1. Make sure that the drive is not running on a degraded volume. If it is, then
add as many spares as necessary into the system, up to the specified level.
2. After the volume is no longer in degraded mode, replace the drive that is
failing.

ses.status.electronicsError
Message

ses.status.electronicsError

Severity

NODE_ERROR

Description

This message occurs when a failure has been detected in the module that
provides disk SCSI Enclosure Services (SES) monitoring capability.

Corrective action Replace the module. In some disk shelf types, this function is integrated into the
Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.

ses.status.electronicsInfo
Message

ses.status.electronicsInfo

Severity

INFO

Description

This message occurs when a problem previously reported about the disk shelf
SCSI Enclosure Services (SES) electronics is corrected or when other
information about the enclosure electronics that does not necessarily require
customer action is reported.

Corrective action None.

ses.status.electronicsWarn
Message

ses.status.electronicsWarn

Severity

WARNING

Description

This message occurs when a non-fatal condition is detected in the module that
provides disk SCSI Enclosure Services (SES) monitoring capability.

Corrective action Replace the module. In some disk shelf types, this function is integrated into the
Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.

ses.status.ESHPctlStatus
Message

ses.status.ESHPctlStatus

Severity

DEBUG

Description

This message occurs when a change in the power control status is detected in
the indicated disk shelf.

Corrective action None.

ses.status.fanError
Message

ses.status.fanError

Severity

NODE_ERROR

Description

This message occurs when the indicated disk shelf cooling fan or fan module
fails, and the shelf or its components are not receiving required cooling airflow.

Corrective action 1. Verify that the fan module is fully seated and secured. (The fan is integrated
into the power supply module in some disk shelves.)
2. If the problem persists, replace the fan module.
3. If the problem persists, contact technical support.

ses.status.fanInfo
Message

ses.status.fanInfo

Severity

INFO

Description

This message occurs when a condition previously reported about the disk shelf
cooling fan or fan module is corrected or when other information about the fans
that does not necessarily require customer action is reported.

Corrective action None.

ses.status.fanWarning
Message

ses.status.fanWarning

Severity

WARNING

Description

This message occurs when a disk shelf cooling fan is not operating to
specification, or a component of a fan module has stopped functioning. The disk
shelf components continue to receive cooling airflow but might eventually reach
temperatures that are out of specification.

Corrective
action

1. Verify that the fan module is fully seated and secured. (The fan is integrated
into the power supply module in some disk shelves.)
2. If the problem persists, replace the fan module.
3. If the problem persists, contact technical support.

ses.status.ModuleError
Message

ses.status.ModuleError

Severity

NODE_ERROR

Description

This message occurs when the reporting disk shelf detects an error in the
indicated disk shelf module.

Corrective action 1. Verify that the shelf module is fully seated and secure.
2. If the problem persists, replace the disk shelf module.

ses.status.ModuleInfo
Message

ses.status.ModuleInfo

Severity

INFO

Description

This message occurs when a previously reported error in the shelf module is
corrected or when other information that does not necessarily require customer
action is reported.

Corrective action None.

ses.status.ModuleWarn
Message

ses.status.ModuleWarn

Severity

WARNING

Description

This message occurs when the reporting disk shelf detects a warning in the
indicated disk shelf module.

Corrective action 1. Verify that the shelf module is fully seated and secure.
2. If the problem persists, replace the disk shelf module.

ses.status.psError
Message

ses.status.psError

Severity

NODE_ERROR

Description

This message occurs when a critical condition is detected in the indicated storage
shelf power supply. The power supply might fail.

Corrective
action

1. Verify that power input to the shelf is correct. If separate events of this type
are reported simultaneously, the common power distribution point might be at
fault.
2. If the shelf is in a cabinet, verify that the power distribution unit is ON and
functioning properly. Make sure that the shelf power cords are fully inserted
and secured, the supply is fully seated and secured, and the supply is switched
ON.
3. Verify that power supply fans, if any, are functioning. If the problem persists,
replace the power supply.
4. If the problem persists, contact technical support.

ses.status.psInfo
Message

ses.status.psInfo

Severity

INFO

Description

This message occurs when a condition previously reported about the disk shelf
power supply is corrected or when other information about the power supply
that does not necessarily require customer action is reported.

Corrective action None.

ses.status.psWarning
Message

ses.status.psWarning

Severity

WARNING

Description

This message occurs when a warning condition is detected in the indicated storage
shelf power supply. The power supply might be able to continue operation.

Corrective
action

1. Verify that the disk shelf is receiving power. If separate events of this type are
reported simultaneously, the common power distribution point might be at
fault.
2. If the disk shelf is in a cabinet, verify that the power distribution unit is
switched on and functioning properly. Make sure that the disk shelf power cords are
fully inserted and secured, the power supply is fully seated and secured, and
the power supply is switched on.
3. If the problem persists, replace the power supply.
4. If the problem persists, contact technical support.

ses.status.temperatureError
Message

ses.status.temperatureError

Severity

NODE_ERROR

Description

This message occurs when the indicated disk shelf temperature sensor reports a
temperature that exceeds the specifications for the disk shelf or its components.

Corrective
action

1. Verify that the ambient temperature where the shelf is installed is within
equipment specifications using the environment shelf [adapter]
command, and that airflow clearances are maintained.
2. If the same disk shelf also reports fan or fan module failures, correct that
problem now. If the problem is reported by the ambient temperature sensor
(located on the operator panel), verify that the connection between the disk
shelf and the panel is secure, if possible.
3. If the problem persists, and if the shelf has multiple temperature sensors of
which only one exhibits the problem, replace the module that contains the
sensor that reports the error. If the problem persists, contact technical support
for assistance.
Note: You can display temperature thresholds for each shelf through the
environment shelf command.

ses.status.temperatureInfo
Message

ses.status.temperatureInfo

Severity

INFO

Description

This message occurs when an error or warning condition previously reported by
or about the disk shelf temperature sensor is corrected or when other
information about the temperature in the disk shelf that does not necessarily
require customer action is reported.

Corrective action None.

ses.status.temperatureWarning
Message

ses.status.temperatureWarning

Severity

WARNING

Description

This message occurs when the indicated disk shelf temperature sensor reports a
temperature that is close to exceeding the specifications for the disk shelf or its
components.

Corrective action

1. Verify that the ambient temperature where the disk shelf is installed is within
equipment specifications by using the environment shelf [adapter]
command, and that airflow clearances are maintained.
2. If this disk shelf also reports fan or fan module errors or warnings, correct
those problems now.
3. If the problem persists, and the shelf has multiple temperature sensors and only
one of them exhibits the problem, replace the module that contains the sensor.
4. If the problem persists, contact technical support.
Note: Temperature thresholds for each shelf can be displayed through the
environment shelf command.
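
The relationship between the temperature messages and the shelf thresholds can be sketched in Python. This is only an illustration, not Data ONTAP code: the Thresholds class, the classify_temperature function, and the numeric limits used with them are hypothetical. Actual thresholds are reported per shelf by the environment shelf command.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical high-temperature limits for one sensor, in degrees Celsius."""
    warn_high_c: float   # "close to exceeding" the specification
    crit_high_c: float   # exceeds the specification


def classify_temperature(reading_c: float, limits: Thresholds) -> Optional[str]:
    """Return the EMS message name a reading would correspond to, or None if in range."""
    if reading_c > limits.crit_high_c:
        return "ses.status.temperatureError"
    if reading_c > limits.warn_high_c:
        return "ses.status.temperatureWarning"
    return None  # within limits; no error or warning condition


if __name__ == "__main__":
    example = Thresholds(warn_high_c=45.0, crit_high_c=55.0)  # made-up values
    for reading in (30.0, 50.0, 60.0):
        print(reading, classify_temperature(reading, example))
```

A reading that returns to the normal range after an earlier warning or error corresponds to ses.status.temperatureInfo, which this stateless sketch does not model.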

ses.status.upsError
Message

ses.status.upsError

Severity

NODE_ERROR

Description

This message occurs when the disk shelf detects a failure in the uninterruptible
power supply (UPS) attached to it. This might occur, for example, if power to
the UPS is lost.

Corrective action

1. Restore power to the UPS.
2. Verify that the connection from the UPS to the disk shelf is in place and
secured and that the UPS is enabled.
3. If the problem persists, contact technical support.

ses.status.upsInfo
Message

ses.status.upsInfo

Severity

INFO

Description

This message occurs when a condition previously reported about the
uninterruptible power supply (UPS) attached to the disk shelf is corrected or
when other information about the UPS that does not necessarily require
customer action is reported.

Corrective action None.

ses.status.volError
Message

ses.status.volError

Severity

NODE_ERROR

Description

This message occurs when a critical condition is detected in the indicated disk
storage shelf voltage sensor. The shelf might be able to continue operation.

Corrective action

1. Verify that the power supply and the AC line are supplying power.
2. Monitor the power grid for abnormalities.
3. Replace the power supply.
4. If the problem persists, contact technical support.

ses.status.volWarning
Message

ses.status.volWarning

Severity

WARNING

Description

This message occurs when a warning condition is detected in the indicated
storage shelf voltage sensor. The shelf might be able to continue operation.

Corrective action

1. Verify that the power supply and the AC line are supplying power.
2. Monitor the power grid for abnormalities.
3. Replace the power supply.
4. If the problem persists, contact technical support.
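
The ses.status message names shown in this section follow a regular pattern: a component name followed by Error, Warning, or Info, with a matching severity. The following Python sketch splits such a name into its parts. The function name split_ses_status is hypothetical, and the suffix-to-severity mapping reflects only the entries listed here; other message families might not follow it.

```python
import re
from typing import Tuple

# Severity observed for each suffix in the ses.status.* entries of this section.
SEVERITY_BY_SUFFIX = {
    "Error": "NODE_ERROR",
    "Warning": "WARNING",
    "Info": "INFO",
}

# Component is the lowercase run (ps, temperature, ups, vol); suffix is capitalized.
_NAME = re.compile(r"^ses\.status\.([a-z]+)(Error|Warning|Info)$")


def split_ses_status(name: str) -> Tuple[str, str]:
    """Return (component, severity) for a ses.status.* message name."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a ses.status message name: {name!r}")
    component, suffix = match.groups()
    return component, SEVERITY_BY_SUFFIX[suffix]


if __name__ == "__main__":
    for event in ("ses.status.volWarning", "ses.status.upsError", "ses.status.psInfo"):
        print(event, split_ses_status(event))
```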

ses.system.em.mmErr
Message

ses.system.em.mmErr

Severity

NODE_FAULT

Description

This message occurs when Data ONTAP does not support this system with
internal disk drives.

Corrective action Check whether this system is currently supported. If it is, upgrade to the
appropriate Data ONTAP version.

ses.tempOwnershipDone
Message

ses.tempOwnershipDone

Severity

DEBUG

Description

This message occurs when SCSI Enclosure Services (SES) completes
temporary ownership acquisition.

Corrective action Contact technical support.

sfu.adapterSuspendIO
Message

sfu.adapterSuspendIO

Severity

INFO

Description

This message occurs during a disk shelf firmware update on a disk shelf that
cannot perform I/O while updating firmware. Typically, the shelves involved
are bridge-based as opposed to ESH-based.

Corrective action None.

sfu.auto.update.off.impact
Message

sfu.auto.update.off.impact

Severity

WARNING

Description

This message occurs when the automated disk shelf firmware update cannot be
completed on a downrev disk shelf enclosure because the (hidden) global
option shelf.fw.auto.update is set to off.

Corrective action Use the storage download shelf command to update the disk shelf firmware
manually. To enable automatic updates, set the hidden option
shelf.fw.auto.update to on.

sfu.ctrllerElmntsPerShelf
Message

sfu.ctrllerElmntsPerShelf

Severity

INFO

Description

This message occurs when a disk shelf firmware download determines the
number of controller elements per shelf that can be downloaded.

Corrective action None.

sfu.downloadCtrllerBridge
Message

sfu.downloadCtrllerBridge

Severity

INFO

Description

This message occurs when a disk shelf firmware download starts on a particular
disk shelf.

Corrective action None.

sfu.downloadError
Message

sfu.downloadError

Severity

ERR

Description

This message occurs when a disk shelf firmware update fails to successfully
download firmware to a disk shelf or shelves in the system.

Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Attempt to download disk shelf firmware again by using the storage
download shelf command.

sfu.downloadingController
Message

sfu.downloadingController

Severity

INFO

Description

This message occurs when a disk shelf firmware download starts on a particular
disk shelf.

Corrective action None.

sfu.downloadingCtrllerR1XX
Message

sfu.downloadingCtrllerR1XX

Severity

INFO

Description

This message occurs when a disk shelf firmware download starts on a particular
disk shelf.

Corrective action None.

sfu.downloadStarted
Message

sfu.downloadStarted

Severity

INFO

Description

This message occurs when a disk shelf firmware update starts to download disk
shelf firmware.

Corrective action None.

sfu.downloadSuccess
Message

sfu.downloadSuccess

Severity

INFO

Description

This message occurs when disk shelf firmware is updated successfully.

Corrective action None.

sfu.downloadSummary
Message

sfu.downloadSummary

Severity

INFO

Description

This message occurs when a disk shelf firmware update is completed
successfully.

Corrective action None.

sfu.downloadSummaryErrors
Message

sfu.downloadSummaryErrors

Severity

ERR

Description

This message occurs when a disk shelf firmware update is completed but the
firmware was not downloaded successfully to all of the shelves it attempted.

Corrective action Issue the storage download shelf command again.

sfu.FCDownloadFailed
Message

sfu.FCDownloadFailed

Severity

ERR

Description

This message occurs when a disk shelf firmware update fails to successfully
download shelf firmware to a Fibre Channel or an ATA shelf.

Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Attempt to download disk shelf firmware again by using the storage
download shelf command.

sfu.firmwareDownrev
Message

sfu.firmwareDownrev

Severity

WARNING

Description

This message occurs when disk shelf firmware is downrev and therefore cannot
be updated automatically.

Corrective action 1. Copy updated disk shelf firmware into the /etc/shelf_fw directory on the
storage appliance.
2. Manually issue the storage download shelf command.

sfu.firmwareUpToDate
Message

sfu.firmwareUpToDate

Severity

INFO

Description

This message occurs when a disk shelf firmware update is requested but the
system determines that all shelves are already updated to the latest
version of firmware available.

Corrective action None.

sfu.partnerInaccessible
Message

sfu.partnerInaccessible

Severity

ERR

Description

This message occurs in an HA pair in which communication between partner
nodes cannot be established.

Corrective action 1. Verify that the HA pair interconnect is operational.
2. Retry the storage download shelf command.

sfu.partnerNotResponding
Message

sfu.partnerNotResponding

Severity

ERR

Description

This message occurs in an HA pair in which one node does not respond to
firmware download requests from another node. In this case, the other node
cannot download disk shelf firmware.

Corrective action

Verify that the HA pair interconnect is up and running on both nodes of the
configuration and then attempt to redownload the disk shelf firmware, using the
storage download shelf command.

sfu.partnerRefusedUpdate
Message

sfu.partnerRefusedUpdate

Severity

ERR

Description

This message occurs in an HA pair in which one node refuses firmware
download requests from its partner node. In this case, the partner node cannot
download disk shelf firmware.

Corrective action

1. Verify that both partners are running the same version of Data ONTAP
and that the HA pair interconnect is up and running on all
nodes of the configuration.
2. Attempt the storage download shelf command again.

sfu.partnerUpdateComplete
Message

sfu.partnerUpdateComplete

Severity

INFO

Description

This message occurs in an HA pair in which a partner downloads disk shelf
firmware and the download is completed. At this point, this notification is sent
and SCSI Enclosure Services (SES) are resumed by the partner.

Corrective action None.

sfu.partnerUpdateTimeout
Message

sfu.partnerUpdateTimeout

Severity

INFO

Description

This message occurs in an HA pair in which a partner downloads disk shelf
firmware but the download times out. At this point, this notification is sent and
SCSI Enclosure Services (SES) are resumed by the partner.

Corrective action 1. Verify that the HA pair interconnect is operational.
2. Retry the storage download shelf command.

sfu.rebootRequest
Message

sfu.rebootRequest

Severity

INFO

Description

This message occurs when the disk shelf firmware update is completed. The
disk shelf reboots to run the new code.

Corrective action None.

sfu.rebootRequestFailure
Message

sfu.rebootRequestFailure

Severity

ERR

Description

This message occurs when an attempt to issue a reboot request after
downloading shelf firmware fails, indicating a software error.

Corrective action Reboot the storage system, if possible, and try the firmware update again.

sfu.resumeDiskIO
Message

sfu.resumeDiskIO

Severity

INFO

Description

This message occurs when a disk shelf firmware update is completed and disk
I/O is resumed.

Corrective action None.

sfu.SASDownloadFailed
Message

sfu.SASDownloadFailed

Severity

ERR

Description

This message occurs when a disk shelf firmware update fails to successfully
download shelf firmware to a shelf.

Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Download disk shelf firmware again by using the storage download
shelf command.

sfu.statusCheckFailure
Message

sfu.statusCheckFailure

Severity

ERR

Description

This message occurs when the storage download shelf command
encounters a failure while attempting to read the status of the firmware update
in progress.

Corrective action Retry the storage download shelf command.

sfu.suspendDiskIO
Message

sfu.suspendDiskIO

Severity

INFO

Description

This message occurs when a disk shelf firmware update is started and disk I/O
is suspended.

Corrective action None.

sfu.suspendSES
Message

Suspending enclosure services -- partner is updating disk shelf firmware.

Severity

INFO

Description

This message occurs when a disk shelf firmware update is requested in an HA
pair environment. In this case, one partner node updates the firmware on the
disk shelf module, and the other partner node temporarily disables SCSI
Enclosure Services (SES) while the firmware update is in progress.

Corrective action None.

USB boot device EMS messages


The universal serial bus (USB) boot device on FAS25xx, 32xx, 62xx, and FAS80xx systems can generate
informational, warning, and error messages. All messages are reported through the EMS.

usb.adapter.debug
Message

usb.adapter.debug

Severity

INFORMATION

Description

This message indicates a Data ONTAP universal serial bus (USB) adapter
driver debug event.

Corrective action None.

usb.adapter.exception
Message

usb.adapter.exception

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver encounters an error with the adapter. The adapter is reset to recover.

Corrective action None.

usb.adapter.failed
Message

usb.adapter.failed

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver cannot recover the adapter after resetting it multiple times. The adapter
and the devices attached to it will not be used anymore.

Corrective action

Take the following actions:


1. If the adapter is in use, verify that all attached devices are supported devices
and that they are seated correctly.
2. If the problem persists, replace the attached devices.
3. If the problem still persists, contact technical support for help in diagnosing a
USB issue.

usb.adapter.reset
Message

usb.adapter.reset

Severity

INFORMATION

Description

This message occurs when the Data ONTAP universal serial bus (USB) driver
resets the specified adapter. This can occur during normal error handling.

Corrective action If the problem persists, then contact technical support.

usb.device.failed
Message

usb.device.failed

Severity

ERROR

Description

This message occurs when multiple consecutive commands to the specified
universal serial bus (USB) device are not completed within the allotted time. All
recovery actions have been taken and the device cannot be used anymore.

Corrective action

Take the following actions:


1. Ensure that all attached devices are supported devices and that they are
seated correctly.
2. If the problem persists, replace the attached devices.
3. If the problem still persists, contact technical support for help in diagnosing a
USB issue.

usb.device.initialize.failed
Message

usb.device.initialize.failed

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver fails to initialize the device attached to the associated port in the associated
adapter for one of the following reasons: it cannot set a unique address for the
device; the device descriptor is invalid or contains incorrect data; it cannot set an
active configuration for the device; or the device has multiple interfaces. Note that
the Data ONTAP USB driver supports only USB 2.0 bulk-only mass storage devices.

Corrective action

Take one of the following actions:


1. If the device is connected to an external USB port, try reinserting the device.
2. If that fails, try replacing the device with a device from a different product
family.
3. If the device is connected to the motherboard and the problem persists, contact
technical support for help in diagnosing a USB issue.

usb.device.maximum.connected
Message

usb.device.maximum.connected

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a new USB device inserted into the associated port in the associated
adapter. This new device cannot be initialized because the maximum number of
USB devices supported by the Data ONTAP USB adapter driver is already
connected to the system.

Corrective action

Take the following actions:


1. Remove a USB device that is already connected but is not being used.
2. Wait for 10 seconds, then reinsert the new device.

usb.device.protocol.mismatch
Message

usb.device.protocol.mismatch

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a protocol mismatch in the device attached to the associated port in
the associated adapter. It can be due to one of the following reasons:

Unsupported interface.
Unsupported device class or device subclass.
Does not support the required pipes.
Does not support required end points.
Does not support the required maximum transfer packet size.

Note that the Data ONTAP USB driver supports only USB 2.0 bulk-only mass
storage devices.

Corrective action

Take one of the following actions:

If the device is connected to an external USB port, try replacing the device with
a device from a different product family.
If the device is connected to the motherboard, contact technical support for help
in diagnosing a USB issue.
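
The kinds of descriptor checks named in the description can be illustrated with a short Python sketch. The class code 0x08 (mass storage), protocol code 0x50 (bulk-only transport), bulk transfer type, and 512-byte high-speed bulk packet size are standard USB values. Everything else is an assumption: the Interface and Endpoint classes and the mismatch_reasons function are hypothetical, and the checks that the Data ONTAP driver actually performs are not documented here.

```python
from dataclasses import dataclass
from typing import List

MASS_STORAGE_CLASS = 0x08      # bInterfaceClass for USB mass storage
BULK_ONLY_PROTOCOL = 0x50      # bInterfaceProtocol for bulk-only transport
BULK_TRANSFER = 0x02           # transfer-type bits of bmAttributes for bulk
HIGH_SPEED_BULK_PACKET = 512   # wMaxPacketSize of a high-speed bulk endpoint


@dataclass
class Endpoint:
    address: int          # bEndpointAddress; bit 7 set means IN
    attributes: int       # bmAttributes
    max_packet_size: int  # wMaxPacketSize


@dataclass
class Interface:
    interface_class: int
    protocol: int
    endpoints: List[Endpoint]


def mismatch_reasons(iface: Interface) -> List[str]:
    """Return the reasons an interface would not look like a bulk-only mass storage device."""
    reasons = []
    if iface.interface_class != MASS_STORAGE_CLASS:
        reasons.append("unsupported device class")
    if iface.protocol != BULK_ONLY_PROTOCOL:
        reasons.append("unsupported interface protocol")
    bulk = [e for e in iface.endpoints if (e.attributes & 0x03) == BULK_TRANSFER]
    has_in = any((e.address & 0x80) != 0 for e in bulk)
    has_out = any((e.address & 0x80) == 0 for e in bulk)
    if not (has_in and has_out):
        reasons.append("missing required bulk endpoints")
    if any(e.max_packet_size != HIGH_SPEED_BULK_PACKET for e in bulk):
        reasons.append("unsupported maximum packet size")
    return reasons
```

An empty list means that none of the sketched checks failed; it does not mean that the device is supported by Data ONTAP.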

usb.device.removed
Message

usb.device.removed

Severity

INFORMATION

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver successfully detects and handles the removal of the associated device,
and the device is no longer accessible.

Corrective action None.

usb.device.timeout
Message

usb.device.timeout

Severity

ERROR

Description

This message occurs when an outstanding command to the specified universal
serial bus (USB) device is not completed within the allotted time. As part of the
standard error handling sequence managed by the Data ONTAP USB adapter
driver, this command to the device is aborted and reissued.

Corrective action

Device-level timeouts are a common indication of a USB link stability problem. In
some cases, the link is operating normally and the specified device is having
internal trouble processing I/O requests in a timely manner. In such cases, evaluate
the specified device for possible replacement. Quite often the problem results from
the partial failure of a component involved in the USB transport. The most
common thing to check is the seating of the USB device in the USB port or the
header.

Take one of the following actions:

If the device is connected to an external USB port, try replacing the device with
a device from a different product family.
If the device is connected to the motherboard, contact technical support for help
in diagnosing the USB issue.
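
The escalation from usb.device.timeout to usb.device.failed can be sketched as a simple retry loop in Python. This is a simplified model, not the driver's implementation: the issue_with_recovery function and the limit of three consecutive timeouts are hypothetical, and the adapter and port resets that the driver might attempt are left out.

```python
from typing import Callable, List

# Hypothetical limit; the number of timeouts the real driver tolerates is not stated here.
MAX_CONSECUTIVE_TIMEOUTS = 3


def issue_with_recovery(send: Callable[[], bool],
                        max_timeouts: int = MAX_CONSECUTIVE_TIMEOUTS) -> List[str]:
    """Issue a command, aborting and reissuing it after each timeout.

    `send` returns True if the command completes within the allotted time
    and False if it times out. The return value lists the EMS message
    names that the sequence of attempts would correspond to.
    """
    events: List[str] = []
    for _ in range(max_timeouts):
        if send():
            return events                      # command completed
        events.append("usb.device.timeout")    # aborted, then reissued
    events.append("usb.device.failed")         # recovery exhausted
    return events


if __name__ == "__main__":
    replies = iter([False, True])              # one timeout, then success
    print(issue_with_recovery(lambda: next(replies)))
```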

usb.device.unsupported
Message

usb.device.unsupported

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an unsupported device attached to the default boot device port on
the motherboard.

Corrective action Contact technical support for a replacement USB boot device.

usb.device.unsupported.speed
Message

usb.device.unsupported.speed

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a non-high-speed device in the associated port.

Corrective action

Remove all non-high-speed devices attached to the system because the Data
ONTAP USB adapter driver does not support non-high-speed devices.

usb.external.device.not.used
Message

usb.external.device.not.used

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a USB device connected to the external port.

Corrective action Remove the external USB device connected to the system.

usb.externalHub.notSupported
Message

usb.externalHub.notSupported

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a USB hub device.

Corrective action Remove all hub devices attached to the system because the USB adapter driver
does not support USB hub devices.

usb.port.error
Message

usb.port.error

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an unrecoverable error on the associated port.

Corrective action Take the following actions:


1. If a device is attached to the associated port, try reinserting the device.
2. If the problem persists, try replacing the device.
3. If the problem still persists, contact technical support for assistance in
diagnosing a USB issue.

usb.port.reset
Message

usb.port.reset

Severity

INFORMATION

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver resets the specified port on the associated adapter. This can occur during
normal error handling.

Corrective action If the problem persists, contact technical support.

usb.port.state.indeterminate
Message

usb.port.state.indeterminate

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver cannot determine the status of the associated port.

Corrective action

Take the following actions:


1. If a device is attached to the associated port, try reinserting the device.
2. If the problem persists, try replacing the device.
3. If the problem still persists, contact technical support for assistance in
diagnosing a USB issue.

usb.port.status.inconsistent
Message

usb.port.status.inconsistent

Severity

ERROR

Description

This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an inconsistent state of the associated port and cannot
communicate with the attached device.

Corrective action

If a device is attached to the associated port, try reinserting the device. If that
fails, try replacing the device. If the problem persists, contact technical support
for assistance in diagnosing a USB issue.

usbmon.boot.device.failed
Message

usbmon.boot.device.failed

Severity

ERROR

Description

This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices determines
that the associated boot device will fail all writes to the media.

Corrective action

Take the following actions:


1. Replace the device.
2. If the problem persists, contact technical support for help in diagnosing the
USB issue.

usbmon.boot.device.pfa
Message

usbmon.boot.device.pfa

Severity

WARNING

Description

This message occurs when the Data ONTAP universal serial bus (USB) boot
device health monitor PFA (predictive failure analysis) determines that failure
is forthcoming for the associated boot device.

Corrective action Take the following actions:


1. Replace the device.
2. If the problem persists, contact technical support for help in diagnosing the
USB issue.

usbmon.disable.module
Message

usbmon.disable.module

Severity

INFORMATION

Description

This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices is disabled.

Corrective action

1. Halt the system by entering the following command at the system prompt:
halt

2. After the system reaches the LOADER prompt, enter the following command:
setenv disableusbmon? false

3. Continue to boot the system by entering the following command at the
LOADER prompt:
boot_ontap

usbmon.unable.to.monitor
Message

usbmon.unable.to.monitor

Severity

WARNING

Description

This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices cannot
extract health information from the monitored device.

Corrective action Take the following actions:


1. Replace the device.
2. If the problem persists, contact technical support.
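
When many USB boot device events are collected from the EMS log, it can help to sort them by the severities listed in this section. The following Python sketch shows one way to do that. The USB_SEVERITY table is copied from the entries above; the triage function and its ordering policy (errors first, informational events dropped) are hypothetical conveniences, not part of Data ONTAP.

```python
from typing import Iterable, List

# Severity of each USB boot device EMS message, as listed in this section.
USB_SEVERITY = {
    "usb.adapter.debug": "INFORMATION",
    "usb.adapter.exception": "WARNING",
    "usb.adapter.failed": "ERROR",
    "usb.adapter.reset": "INFORMATION",
    "usb.device.failed": "ERROR",
    "usb.device.initialize.failed": "ERROR",
    "usb.device.maximum.connected": "WARNING",
    "usb.device.protocol.mismatch": "ERROR",
    "usb.device.removed": "INFORMATION",
    "usb.device.timeout": "ERROR",
    "usb.device.unsupported": "ERROR",
    "usb.device.unsupported.speed": "ERROR",
    "usb.external.device.not.used": "WARNING",
    "usb.externalHub.notSupported": "WARNING",
    "usb.port.error": "ERROR",
    "usb.port.reset": "INFORMATION",
    "usb.port.state.indeterminate": "WARNING",
    "usb.port.status.inconsistent": "ERROR",
    "usbmon.boot.device.failed": "ERROR",
    "usbmon.boot.device.pfa": "WARNING",
    "usbmon.disable.module": "INFORMATION",
    "usbmon.unable.to.monitor": "WARNING",
}

_RANK = {"ERROR": 0, "WARNING": 1}  # lower rank sorts first


def triage(names: Iterable[str]) -> List[str]:
    """Return ERROR and WARNING events, errors first; unknown names are dropped."""
    actionable = [n for n in names if USB_SEVERITY.get(n) in _RANK]
    return sorted(actionable, key=lambda n: _RANK[USB_SEVERITY[n]])


if __name__ == "__main__":
    print(triage(["usb.port.reset", "usbmon.boot.device.pfa", "usb.port.error"]))
```

Severity alone is not a complete guide: some informational messages, such as usb.adapter.reset, still advise contacting technical support if the condition persists.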

Operational error messages


Operational error messages might appear on your system console or LCD when the system is
operating, when it is halted, or when it is restarting because of system problems.

Disk hung during swap


Message

Disk hung during swap

Description

A disk error occurred as you were hot-swapping a disk.

Fatal?

Yes.

Corrective action 1. Disconnect the disk from the power supply by opening the latch and pulling
it halfway out.
2. Wait 15 seconds to allow all disks to spin down.
3. Reinstall the disk.
4. Restart the system by entering the following command:
boot

Disk n is broken
Message

Disk n is broken

Description

n is the RAID group disk number. The solution depends on whether you have a
hot spare in the system.

Fatal?

No.

Corrective action See the appropriate system administration guide for information about how to
locate a disk based on the RAID group disk number and how to replace a faulty
disk.

Dumping core
Message

Dumping core

Description

The system is dumping core after a system crash.

Fatal?

Yes.

Corrective action Write down the system crash message on the system console and report the
problem to technical support.

Error dumping core


Message

Error dumping core

Description

The system cannot dump core during a system crash and restarts without
dumping core.

Fatal?

Yes.

Corrective action Report the problem to technical support.

FC-AL LINK_FAILURE
Message

FC-AL LINK_FAILURE

Description

Fibre Channel arbitrated loop has link failures.

Fatal?

No.

Corrective action Report the problem to technical support.

FC-AL RECOVERABLE ERRORS


Message

FC-AL RECOVERABLE ERRORS

Description

Fibre Channel arbitrated loop has been determined to be unreliable. The link
errors are recoverable in the sense that the system is still up and running.

Fatal?

No.

Corrective action Report the problem to technical support.

Panicking
Message

Panicking

Description

The system is crashing. If the system does not hang while crashing, the message
Dumping core appears.

Fatal?

Yes.

Corrective action Report the problem to technical support.

RMC Alert: Boot Error


Message

RMC Alert: Boot Error

Description

RMC card sent a DOWN APPLIANCE message. Causes might be a down system,
a boot error, or an OFW POST error.

Fatal?

Yes.

Corrective action Harness script filters them and creates a case.
Contact technical support.

RMC Alert: Down Appliance


Message

RMC Alert: Down Appliance

Description

RMC card sent a DOWN APPLIANCE message. Causes might be a down system,
a boot error, or an OFW POST error.

Fatal?

Yes.

Corrective action Harness script filters them and creates a case.
Contact technical support.

RMC Alert: OFW POST Error


Message

RMC Alert: OFW POST Error

Description

RMC card sent a DOWN APPLIANCE message. Causes might be a down
system, a boot error, or an OFW POST error.

Fatal?

Yes.

Corrective action Harness script filters them and creates a case.
Contact technical support.
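
If console output is captured to a file, the operational messages above can be picked out with a small script. The following Python sketch flags the lines whose entries are marked as fatal. The message texts and Fatal? values come from the entries above; the fatal_lines function and the assumption that console output is available as a list of text lines are hypothetical.

```python
import re
from typing import Iterable, List

# (pattern, fatal) pairs; the fatal flag is the "Fatal?" field of each entry above.
OPERATIONAL_MESSAGES = [
    (re.compile(r"Disk hung during swap"), True),
    (re.compile(r"Disk \d+ is broken"), False),
    (re.compile(r"Error dumping core"), True),
    (re.compile(r"Dumping core"), True),
    (re.compile(r"FC-AL LINK_FAILURE"), False),
    (re.compile(r"FC-AL RECOVERABLE ERRORS"), False),
    (re.compile(r"Panicking"), True),
    (re.compile(r"RMC Alert: (Boot Error|Down Appliance|OFW POST Error)"), True),
]


def fatal_lines(console_lines: Iterable[str]) -> List[str]:
    """Return the console lines that match a message documented as fatal."""
    hits = []
    for line in console_lines:
        for pattern, fatal in OPERATIONAL_MESSAGES:
            if pattern.search(line):
                if fatal:
                    hits.append(line)
                break  # first matching entry decides; stop checking this line
    return hits


if __name__ == "__main__":
    sample = ["Disk 3 is broken", "Panicking", "Dumping core", "login:"]
    print(fatal_lines(sample))
```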

UTA2 (CNA) error messages


You might see error messages on the system console when configuring a UTA2 (CNA) port or
adapter card.

UTA2 (CNA) error messages on systems operating in maintenance mode or Data ONTAP 7-Mode

There are specific error messages you might see when configuring a UTA2 (CNA) port or adapter on
a system in maintenance mode or when running Data ONTAP operating in 7-Mode.
The following table describes error messages you might see when configuring an onboard UTA2
(CNA) port or PCIe adapter card:
Condition

Port or adapter is not configurable

Description

When you attempt to use the ucadmin command to configure a non-UTA2 (CNA)
port or adapter

Command example

When you attempt to change the mode, but the adapter you select is not a UTA2
(CNA) port or card:
node> ucadmin modify -m cna -t target 0c

Error message example

ucadmin modify: Adapter 0c does not support mode changes

Command example

When you attempt to change the type, but the port or adapter you select is not a
UTA2 (CNA) port or card:
node> ucadmin modify -t initiator 3a

Error message example

ucadmin modify: Adapter 3a does not support FC4 type changes

Condition

Port or adapter is not offline

Description

When you attempt to make changes while the port or adapter is online

Command example

When you started in FC target mode and attempt to change to another mode or type:
node> ucadmin modify -t initiator 0d

Error message example

ucadmin modify: Adapter 0d must be offline before changing configuration; use the
"fcp config 0d down" command to disable the adapter and try again

Command example

When you started in FC initiator mode and try to change to another mode or type:
node> ucadmin modify -t target 0f

Error message example

ucadmin modify: Adapter 0f must be offline before changing configuration; use the
"storage disable adapter <name>" command to disable the adapter and retry the
command

Condition

Command or parameter is not recognized

Description

When you select an invalid mode

Command example

When you try to change the mode but select an invalid mode:
node> ucadmin modify -m asdf -t initiator 0c

Error message example

ucadmin modify: Invalid mode argument -- asdf
Usage: ucadmin modify [-m <mode>] [-t <type>] [-f] <adapter>
Modifies Fibre Channel and converged network adapter configuration
adapter -- adapter name
-m mode -- fc | cna
-t type -- initiator | target
-f -- force change without confirmation

Description

When you select an invalid type

Command example

When you try to change the type but select an invalid type:
node> ucadmin modify -t asdf 3a

Error message example

ucadmin modify: Invalid type argument -- asdf
Usage: ucadmin modify [-m <mode>] [-t <type>] [-f] <adapter>
Modifies Fibre Channel and converged network adapter configuration
adapter -- adapter name
-m mode -- fc | cna
-t type -- initiator | target
-f -- force change without confirmation
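
The argument rules in the usage text above can be mirrored in a small Python helper that builds a ucadmin modify command line and rejects unknown values before the command is sent to the storage system. Only the -m, -t, and -f options and their allowed values are taken from the usage text. The function name build_ucadmin_modify, its parameters, and the wording of the exceptions are hypothetical.

```python
from typing import Optional

VALID_MODES = ("fc", "cna")              # from "-m mode -- fc | cna"
VALID_TYPES = ("initiator", "target")    # from "-t type -- initiator | target"


def build_ucadmin_modify(adapter: str,
                         mode: Optional[str] = None,
                         fc4_type: Optional[str] = None,
                         force: bool = False) -> str:
    """Build a 7-Mode `ucadmin modify` command line, validating mode and type."""
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"Invalid mode argument -- {mode}")
    if fc4_type is not None and fc4_type not in VALID_TYPES:
        raise ValueError(f"Invalid type argument -- {fc4_type}")
    parts = ["ucadmin", "modify"]
    if mode is not None:
        parts += ["-m", mode]
    if fc4_type is not None:
        parts += ["-t", fc4_type]
    if force:
        parts.append("-f")  # force change without confirmation
    parts.append(adapter)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ucadmin_modify("0c", mode="cna", fc4_type="target"))
```

Validation of this kind catches only the "Command or parameter is not recognized" condition. Whether the adapter is a UTA2 (CNA) port and whether it is offline can be determined only by the storage system itself.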

UTA2 (CNA) error messages on systems running clustered Data ONTAP


There are specific error messages you might see when configuring a UTA2 (CNA) port or adapter on
a system running clustered Data ONTAP.
The following table describes error messages you might see when configuring an onboard UTA2
(CNA) port or PCIe adapter card from the cluster shell:

EMS and operational messages | 309


Condition: Port or adapter is not configurable

Description: When you attempt to use the ucadmin command to configure a non-UTA2 (CNA) port or adapter

Command example: When you attempt to change the mode, but the adapter you select is not a UTA2 (CNA) port or card:
cluster::> system node hardware unified-connect modify -node f-a -adapter 0c -mode cna

Error message example:
Error: command failed: Adapter 0c does not support mode changes

Command example: When you attempt to change the type, but the port or adapter you select is not a UTA2 (CNA) port or card:
cluster::> system node hardware unified-connect modify -node node1 -adapter 0c -mode cna -type initiator

Error message example:
Error: command failed: Adapter 0c does not support mode changes

Condition: Port or adapter is not offline

Description: When you attempt to make changes while the port or adapter is online

Command example: When you started in FC target mode and attempt to change to another mode or type:
cluster::> system node hardware unified-connect modify -node node1 -adapter 5a -mode cna -type target

Error message example:
Error: command failed: Adapter 5a must be offline before changing
configuration; use the "fcp adapter modify node node1 adapter 5a state
down" command to offline the adapter and try again

Command example: When you started in FC initiator mode and try to change to another mode or type:
cluster::> system node hardware unified-connect modify -node node1 -adapter 0f -type target

Error message example:
Error: Adapter "0f" must be offline before changing configuration; use the
"system node run local storage disable adapter <name>" command to disable
the adapter and retry the command

Condition: Command or parameter is not recognized

Description: When you select an invalid mode

Command example: When you try to change the mode but select an invalid mode:
cluster::> system node hardware unified-connect modify -node node1 -adapter 5a -mode asdf

Error message example:
Error: "asdf" is an invalid value for field "mode <fc|cna>"
cluster::> system node hardware unified-connect modify ?
-node <nodename>              Node
[-adapter] <text>             Adapter
[-mode {fc|uta}]
[-type {initiator|target}]    Configured FC4 type
[[force|-f] [true]]           Force Configuration Changes

Description: When you select an invalid type

Command example: When you try to change the type but select an invalid type:
cluster::> system node hardware unified-connect modify -node node1 -adapter 5a -mode cna -type asdf

Error message example:
Error: "asdf" is an invalid value for field "-type <initiator|target>"
cluster::> system node hardware unified-connect modify ?
-node <nodename>              Node
[-adapter] <text>             Adapter
[-mode {fc|uta}]
[-type {initiator|target}]    Configured FC4 type
[[force|-f] [true]]           Force Configuration Changes
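When these errors show up in collected console logs, it can help to sort them by the conditions in the table above. The following Python sketch is only an illustration: the function name `classify_uta2_error` is hypothetical, and the matching substrings are taken from the error message examples in this table, so other Data ONTAP releases might word the messages differently.

```python
from typing import Optional

# Substrings taken from the error message examples in the table above.
# The labels are the table's own "Condition" entries.
_PATTERNS = [
    ("does not support mode changes", "Port or adapter is not configurable"),
    ("must be offline before changing", "Port or adapter is not offline"),
    ("is an invalid value for field", "Command or parameter is not recognized"),
]


def classify_uta2_error(message: str) -> Optional[str]:
    """Map a UTA2 (CNA) configuration error message to its table condition."""
    # Console output may wrap, so collapse all whitespace before matching.
    normalized = " ".join(message.split())
    for needle, condition in _PATTERNS:
        if needle in normalized:
            return condition
    return None  # message does not match any example in the table
```

A return value of `None` means the text matched none of the documented examples and should be read manually.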


Service Processor messages


The Service Processor (SP) enables you to access, monitor, and troubleshoot 22xx, 32xx, 62xx, 80xx,
SA320, and SA620 storage systems remotely. Two types of messages are associated with the SP and
can help you monitor your system and troubleshoot problems.
The SP sends AutoSupport messages when certain problems occur. These might include a loss of
heartbeat or a reboot failure.
Data ONTAP generates EMS messages when SP events and errors occur. These might include a
reminder to configure the SP or an alert to an SP communication problem.
Note: For more information about what the SP does, see the System Administration Guide for the
version of Data ONTAP that your system is running.

When and how SP AutoSupport e-mail messages are sent


The SP generates AutoSupport e-mail messages when the system goes down or when certain
problems occur.
The SP sends the messages under the following conditions:

The storage system reboots unexpectedly.


The storage system stops communicating with the SP.
A watchdog reset occurs.
The watchdog is a built-in hardware sensor that monitors the storage system for a hung or
unresponsive condition. If the watchdog detects such a condition, it resets the storage system so
that the system can automatically reboot and begin functioning.
The storage system is power-cycled.
Firmware power-on self-test (POST) errors occur.
A user initiates an AutoSupport message.
A user resets the system using the SP.

The subject line of e-mail messages contains the word Notification and includes the host name of the
system and the message type. The following text shows an example of an SP AutoSupport e-mail
subject line:
System Notification from host_name (HEARTBEAT_LOSS [WARNING]

Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.
Note: The SP must be properly configured to send AutoSupport messages. For information about
configuring the SP, see the System Administration Guide and the Software Setup Guide for the
version of Data ONTAP that your system is running.


What SP AutoSupport e-mail messages include


SP AutoSupport e-mail messages have different sections that contain different kinds of information
about your system.
SP e-mail messages include the following sections and information:

Subject line: a system notification from the SP of the system, stating the system condition or
event that caused the AutoSupport message and the log level.
Message body: the SP configuration and version information, the system ID, serial number,
model, and host name.
Attachments: System Event Logs, the system sensor state as determined by the SP, and console
logs.

When and how SP EMS messages are sent


Data ONTAP generates EMS messages when problems occur with the SP and displays them on the
system console.
Problems that trigger EMS messages might include installation of the wrong version of firmware,
communication failure, or a network configuration failure.
The console message includes the name of the EMS message and a brief description of the event or
problem. The following text contains an example of an SP EMS message:
Date [sp.notConfigured:warning] The system's Service Processor (SP) is
not configured. Use the 'sp setup' command to configure it.
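The bracketed tag in the console example carries the EMS message name and its severity. The following Python sketch shows one way to pull those fields out of a captured console line. The names `EmsMessage` and `parse_ems_line` are hypothetical, and the pattern is inferred only from the example above, so real console lines might carry additional fields.

```python
import re
from typing import NamedTuple, Optional

# Matches the bracketed "[event.name:severity]" tag shown in the console example.
# A colon directly after the closing bracket is tolerated in case one is printed.
EMS_TAG = re.compile(
    r"\[(?P<name>[A-Za-z0-9_.]+):(?P<severity>[A-Za-z_]+)\]:?\s*(?P<text>.*)"
)


class EmsMessage(NamedTuple):
    name: str
    severity: str
    text: str


def parse_ems_line(line: str) -> Optional[EmsMessage]:
    """Return the EMS name, severity, and description found in a console line."""
    match = EMS_TAG.search(line)
    if match is None:
        return None  # no EMS tag on this line
    return EmsMessage(match["name"], match["severity"].upper(), match["text"].strip())
```

The severity is upper-cased so that it can be compared directly with the Severity values listed for each EMS message in this chapter.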

SP-generated AutoSupport messages


The SP continuously monitors the system's health and generates AutoSupport messages when
problems occur.

HEARTBEAT_LOSS
Message

HEARTBEAT_LOSS

Description

This message is sent by the Service Processor (SP) when it detects loss of
heartbeat from Data ONTAP, possibly because the system has stopped serving
data.

Corrective
action

If this was a manually triggered or expected reboot, no action is needed.


Otherwise, complete the following steps:



1. Check the status of the system and determine whether it is operational.
2. Contact technical support.

REBOOT (abnormal)
Message

REBOOT (abnormal)

Description

This message is sent by the Service Processor (SP) when it detects an abnormal
reboot of the system.

Corrective
action

If this was a manually triggered or expected reboot, no action is needed.


Otherwise, complete the following steps:
1. Check the status of the system and determine the cause of reboot.
2. If the system fails to boot, contact technical support.

SYSTEM_BOOT_FAILED (POST failed)


Message

SYSTEM_BOOT_FAILED (POST failed)

Description

This message is sent by the Service Processor (SP) when the system firmware
has a Power On Self Test (POST) failure and cannot load and run Data
ONTAP.

Corrective action

1. Run diagnostics on your system.
2. Contact technical support.

USER_TRIGGERED (sp test)


Message

USER_TRIGGERED (sp test)

Description

This message is sent by the Service Processor (SP) when the sp test
autosupport command is run from the Data ONTAP CLI. This is a test
mechanism to verify the SP configuration.

Corrective action None.

USER_TRIGGERED (system nmi)


Message

USER_TRIGGERED (system nmi)

Description

This message is sent by the Service Processor (SP) when a user issues a system
core dump (NMI) SP command.

Corrective action None.


USER_TRIGGERED (system power cycle)


Message

USER_TRIGGERED (system power cycle)

Description

This message is sent by the Service Processor (SP) when a user power-cycles
the system using SP.

Corrective action None.

USER_TRIGGERED (system power off)


Message

USER_TRIGGERED (system power off)

Description

This message is sent by the Service Processor (SP) when a user powers off the
system using the SP.

Corrective action None.

USER_TRIGGERED (system reset)


Message

USER_TRIGGERED (system reset)

Description

This message is sent by the Service Processor (SP) when a user resets the
system using the SP.

Corrective action None.
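The SP-generated AutoSupport messages above fall into two groups: USER_TRIGGERED messages, which need no action, and the remaining messages, which call for a check of the system. The following Python sketch restates that triage as a lookup. The function name and the shortened action strings are illustrative paraphrases of the entries above, not output from any NetApp tool.

```python
# Message types and actions are paraphrased from the entries above.
SP_AUTOSUPPORT_ACTIONS = {
    "HEARTBEAT_LOSS": (
        "If not an expected reboot: check whether the system is operational, "
        "then contact technical support."
    ),
    "REBOOT (abnormal)": (
        "If not an expected reboot: determine the cause of the reboot; "
        "contact technical support if the system fails to boot."
    ),
    "SYSTEM_BOOT_FAILED (POST failed)": (
        "Run diagnostics on the system, then contact technical support."
    ),
}


def corrective_action(message_type: str) -> str:
    """Return a short corrective-action summary for an SP AutoSupport message."""
    # Every USER_TRIGGERED entry in this section lists "None." as its action.
    if message_type.startswith("USER_TRIGGERED"):
        return "None."
    return SP_AUTOSUPPORT_ACTIONS.get(message_type, "Unknown message type.")
```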

EMS messages about the SP


Data ONTAP generates EMS messages when problems occur with the SP.

sp.firmware.upgrade.reqd
Message

sp.firmware.upgrade.reqd

Severity

WARNING

Description

This message occurs when the Service Processor (SP) firmware version and the
Data ONTAP software version are incompatible and cannot communicate
correctly about a particular capability.

Corrective
action

Update the firmware version of the SP to the version recommended for your
version of Data ONTAP. The firmware and update instructions are available on
the NetApp Support Site. After you update the firmware, this message should no
longer occur. If the message occurs again, contact technical support and explain
that you already updated the firmware to the recommended version.


sp.firmware.version.unsupported
Message

sp.firmware.version.unsupported

Severity

WARNING

Description

This message occurs when the firmware on the Service Processor (SP) is an
unsupported version and must be upgraded.

Corrective
action

Update the SP firmware to a supported version. The firmware and instructions are
available on the NetApp Support Site at mysupport.netapp.com. After the SP is
running the new firmware, this message should no longer occur. If the message
occurs again, contact technical support and explain that you already updated the
firmware to the recommended version.

sp.heartbeat.resumed
Message

sp.heartbeat.resumed

Severity

INFO

Description

This message occurs when the system detects resumption of Service Processor
(SP) heartbeat notifications indicating that the SP is now available. The earlier
issue indicated by the sp.heartbeat.stopped event has been resolved.

Corrective action None.

sp.heartbeat.stopped
Message

sp.heartbeat.stopped

Severity

WARNING

Description

This message occurs when Data ONTAP does not receive expected Service
Processor (SP) heartbeat notifications. The SP and Data ONTAP exchange
heartbeat messages so that they can detect when one or the other is unavailable.
This event is generated when Data ONTAP has not received an expected
heartbeat message from the SP.

Corrective
action

1. Connect to the SP CLI and enter the following commands:


sp version
priv set advanced
sp log debug
sp log messages

2. Run SP system diagnostics.



3. If you still see this EMS message, contact technical support.
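The sp.heartbeat.stopped and sp.heartbeat.resumed events describe a simple timeout scheme: one side expects periodic heartbeats and reports when they stop or start again. The following Python sketch models that idea only conceptually. The class name `HeartbeatMonitor` and the `timeout_s` parameter are hypothetical, and this guide does not state the actual heartbeat interval or how Data ONTAP implements the check.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks when the last heartbeat arrived and reports stop/resume events."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s  # hypothetical value; not given in this guide
        self.last_seen: Optional[float] = None
        self.stopped = False

    def beat(self, now: float) -> Optional[str]:
        """Record a heartbeat; report a resume event if it ends an outage."""
        self.last_seen = now
        if self.stopped:
            self.stopped = False
            return "sp.heartbeat.resumed"
        return None

    def check(self, now: float) -> Optional[str]:
        """Report a stop event once when an expected heartbeat is overdue."""
        if self.last_seen is None or self.stopped:
            return None
        if now - self.last_seen > self.timeout_s:
            self.stopped = True
            return "sp.heartbeat.stopped"
        return None
```

In this model the stop event is raised once per outage, and the next heartbeat produces the matching resume event, which mirrors how the two EMS messages refer to each other.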

sp.network.link.down
Message

sp.network.link.down

Severity

WARNING

Description

This message occurs when the Service Processor (SP) detects a link error on the
SP network port. This can happen if a network cable is not plugged into the SP
network port. It can also happen if the network that the SP is connected to cannot
run at 10/100 Mbps.

Corrective
action

1. Check whether the network cable is correctly plugged into the SP network
port.
2. Check the link status LED on the SP.
3. Verify that the network that the SP is connected to supports autonegotiation to
10/100 Mbps or is running at one of those speeds; otherwise, SP network
connectivity does not work.
The SP supports a 10/100 Mbps Ethernet network in autonegotiation mode.

sp.notConfigured
Message

sp.notConfigured

Severity

WARNING

Description

This message occurs weekly to remind you to configure the Service Processor
(SP). The SP is a physical device that is incorporated into your system to provide
remote access and remote management capabilities. To use the full functionality
of SP, you must configure it first.

Corrective
action

Ensure that AutoSupport mailhosts and recipients are properly configured in Data
ONTAP, and then take the following actions:
1. Configure the SP by entering the following command:
sp setup

If necessary, use the sp status command to obtain the SP's MAC address.
2. Verify the SP network configuration by entering the following command:
sp status

3. Verify that the SP can send AutoSupport messages by entering the following
command:
sp test autosupport


sp.orftp.failed
Message

sp.orftp.failed

Severity

WARNING

Description

This message occurs when there is a communication error while sending
information to or receiving information from the Service Processor (SP). This
error could be due to the following reasons:

Communication error while the information is being sent or received.
SP is nonoperational.

Corrective
action

1. Check whether the SP is operational by entering the following command at
the Data ONTAP prompt:
sp status

2. If the SP is operational and this message persists, reboot the SP by entering
the following command at the Data ONTAP prompt:
sp reboot

3. If this message persists after you reboot the SP, contact technical support.

sp.snmp.traps.off
Message

sp.snmp.traps.off

Severity

INFO

Description

This message occurs each time a system boots, if the advanced privilege level in
Data ONTAP was used to disable the SNMP Trap feature of the Service
Processor (SP).
This message also occurs when the SNMP Trap capability is disabled and a user
invokes a Data ONTAP command to use the SP to send an SNMP trap.

Corrective
action

SP SNMP Trap support is currently disabled. To enable this feature, set the
sp.snmp.traps option to On.

sp.userlist.update.failed
Message

sp.userlist.update.failed

Severity

WARNING



Description

This message occurs when there is an error updating user information for the
Service Processor (SP). When user information is updated on Data ONTAP, the SP
is also updated with the new changes. This enables users to log in to the SP.
User information update for the Service Processor (SP) may have failed due to the
following reasons:

Communication error with the SP.
SP might not be operational.

Corrective
action

1. Check whether the SP is operational by entering the following command at the
Data ONTAP prompt:
sp status

2. If the SP is operational and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot

3. Retry the operation that caused the error message.
4. If this message persists after you reboot the SP, contact technical support.

spmgmt.driver.hourly.stats
Message

spmgmt.driver.hourly.stats

Severity

WARNING

Description

This message occurs when the system encounters an error while trying to get
hourly statistics from the Service Processor (SP). The error could be due to the
following reasons:

Communication error with the SP.
SP is not operational.

Corrective
action

1. Check whether the SP is online by entering the following command at the Data
ONTAP prompt:
sp status

2. If the SP is online and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot

3. If this message persists after you reboot the SP, contact technical support.


spmgmt.driver.mailhost
Message

spmgmt.driver.mailhost

Severity

WARNING

Description

This message occurs when the Service Processor (SP) setup attempts to verify
whether a mailhost specified in Data ONTAP can be reached. In this case, SP
setup cannot connect to the specified mailhost.

Corrective
action

1. Verify that a valid mailhost is configured in Data ONTAP by checking the
system AutoSupport configuration.
2. Ensure that Data ONTAP can successfully connect to the specified mailhost
by entering a test AutoSupport command.

spmgmt.driver.network.failure
Message

spmgmt.driver.network.failure

Severity

WARNING

Description

This message occurs when the system encounters a failure during network
configuration of the Service Processor (SP). The system cannot assign the SP a
DHCP (Dynamic Host Configuration Protocol) or fixed IP address.

Corrective
action

1. Check whether the network cable is correctly plugged into the SP network port.
2. Check the link status LED on the SP.
3. Verify that the network that the SP is connected to supports autonegotiation to
10/100 Mbps or is running at one of those speeds; otherwise, SP network
connectivity does not work.
The SP supports a 10/100 Mbps Ethernet network in autonegotiation mode.

spmgmt.driver.timeout
Message

spmgmt.driver.timeout

Severity

WARNING

Description

This message occurs when there is a failure during communication with the
Service Processor (SP) firmware. The failure could be due to the following
reasons:

Communication error with the SP.
SP is not operational.

Corrective
action

1. Check whether the SP is online by entering the following command at the Data
ONTAP prompt:
sp status

2. If the SP is operational and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot

After the reboot, this message should no longer occur. If the message occurs
again, contact technical support and explain that you already performed the
preceding steps.


RLM messages
The RLM provides remote management capabilities for some storage systems and continuously
monitors system health. Two types of messages are associated with the RLM and can help you
monitor your system and troubleshoot problems.
The following systems contain RLMs:

30xx and SA300 systems


31xx systems
60xx and SA600 systems

The RLM sends AutoSupport messages when certain problems occur with the system. These might
include a reboot failure or a user-triggered power cycle.
Data ONTAP generates EMS messages when RLM events and errors occur. These might include a
firmware update failure or a communication error.
Note: For more information about what the RLM does, see the System Administration Guide for
the version of Data ONTAP that your system is running.

When and how RLM AutoSupport e-mail messages are sent


The RLM generates AutoSupport e-mail messages when the system goes down or when certain
problems occur.
The RLM sends AutoSupport e-mail messages under the following conditions:

The system reboots unexpectedly


The system stops communicating with the RLM
A watchdog reset occurs
The system is power-cycled
Firmware POST errors occur
A user-initiated AutoSupport message occurs

The subject line of e-mail messages contains the words "System Notification" and includes the host
name of the system and the message type. The following text shows an example of an RLM
AutoSupport e-mail subject line:
System Notification from system (RLM HBT STOPPED)CRITICAL

Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.
Note: The RLM must be properly configured to send AutoSupport messages. For information
about configuring the RLM, see the System Administration Guide and the Software Setup Guide
for the version of Data ONTAP that your system is running.


What RLM AutoSupport e-mail messages include


RLM AutoSupport e-mail messages have different sections that contain different kinds of
information about your system.
RLM e-mail messages include the following sections and information:

Subject line: a system notification from the RLM of the system, stating the system condition or
event that caused the AutoSupport message and the log level.
Message body: the RLM configuration and version information, the system ID, serial number,
model number, and host name.
Attachments: SELs, the system sensor state as determined by the RLM, and console logs.
Note: For more information about the contents of AutoSupport messages, see the System
Administration Guide for the version of Data ONTAP running on your system.

When and how RLM EMS messages are sent


Data ONTAP generates EMS messages when problems occur with the RLM and displays them on
the system console.
Problems that trigger EMS messages might include failed network configuration, failed RLM
heartbeat, or firmware update errors.
The console message includes the name of the EMS message and a brief description of the event or
problem. The following text contains an example of an RLM EMS message:
[rlm.orftp.failed:warning]: RLM communication error, unsupported send
request

RLM-generated AutoSupport messages


The RLM continuously monitors the system's health and generates AutoSupport messages when the
system goes down or when other problems, such as startup errors, occur.

Heartbeat loss warning


Message

Heartbeat loss warning

Description

The Remote LAN Module (RLM) detects that the system is offline, possibly
because the system stopped serving data.



Corrective
action

If this system shutdown was manually triggered, no action is necessary.


Otherwise, complete the following steps.
1. Check the status of your system and verify that the system and disk shelves
are operational.
2. Contact technical support if the problem persists.

Reboot (power loss) critical


Message

Reboot (power loss) critical

Description

The Remote LAN Module (RLM) detects that the system lost AC power.

Corrective action If you switched off the system before you received the notification, no action is
necessary. Otherwise, restore power to the system.

Reboot (watchdog reset) warning


Message

Reboot (watchdog reset) warning

Description

The Remote LAN Module (RLM) detects a watchdog reset error.

Corrective action

1. Check the system to verify that it is operational.


2. If your system is operational, run diagnostics on your entire system.
3. Contact technical support if the storage system is not serving data.

Reboot warning
Message

Reboot warning

Description

The Remote LAN Module (RLM) detects an abnormal system reboot.

Corrective action If this was a manually triggered or expected reboot, no action is necessary.
Otherwise, complete the following steps.
1. Check the status of the system and determine the cause of the reboot.
2. Contact technical support if the system fails to reboot.

RLM heartbeat loss


Message

RLM heartbeat loss

Description

The Remote LAN Module (RLM) detects the loss of heartbeat from Data
ONTAP. The system possibly stopped serving data.



Corrective action 1. Connect to the RLM command-line interface (CLI) to check whether the
RLM is operational.
2. Contact technical support if the problem persists.

RLM heartbeat stopped


Message

RLM heartbeat stopped

Description

The system software cannot see the RLM.

Corrective action 1. Connect to the RLM command-line interface (CLI) to check whether the
RLM is operational.
2. Contact technical support if the problem persists.

System boot failed (POST failed)


Message

System boot failed (POST failed)

Description

The Remote LAN Module (RLM) detects that a system error occurred during
the POST and the system software cannot be booted.

Corrective action

1. Run diagnostics on your system.
2. Contact technical support if running diagnostics does not detect any faulty
components.

User triggered (RLM test)


Message

User triggered (RLM test)

Description

The Remote LAN Module (RLM) received the rlm test command, which
tests the RLM configuration.

Corrective action No action is necessary.

User_triggered (system nmi)


Message

User_triggered (system nmi)

Description

A user is initiating a system core dump (nmi) through the Remote LAN Module
(RLM).

Corrective action No action is necessary.


User_triggered (system power cycle)


Message

User_triggered (system power cycle)

Description

A user is initiating a system power-cycle through the Remote LAN Module
(RLM).

Corrective action No action is necessary.

User_triggered (system power off)


Message

User_triggered (system power off)

Description

A user is powering off the system through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system power on)


Message

User_triggered (system power on)

Description

A user is powering on the system through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system reset)


Message

User_triggered (system reset)

Description

A user is resetting the system through the Remote LAN Module (RLM).

Corrective action

No action is necessary.

EMS messages about the RLM


Data ONTAP generates EMS messages when problems occur with the RLM. These problems might
include failed network configuration or firmware update errors.

rlm.driver.hourly.stats
Message

rlm.driver.hourly.stats

Severity

Warning

Description

The system encountered an error while trying to get hourly statistics from the
Remote LAN Module (RLM).



Corrective action 1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status

2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot

rlm.driver.mailhost
Message

rlm.driver.mailhost

Severity

Warning

Description

This message occurs when Remote LAN Module (RLM) setup verifies whether
a mailhost specified in Data ONTAP can be reached. In this case, RLM setup cannot
connect to the specified mailhost.

Corrective action 1. Verify that a valid mailhost is configured in Data ONTAP by checking the
system AutoSupport configuration.
2. Ensure that Data ONTAP can successfully connect to the specified mailhost by
entering a test AutoSupport command.

rlm.driver.network.failure
Message

rlm.driver.network.failure

Severity

Warning

Description

A failure occurred during the network configuration of the Remote LAN Module
(RLM). The system could not assign the RLM a Dynamic Host Configuration
Protocol (DHCP) or fixed IP address.

Corrective
action

1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status

2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot

rlm.driver.timeout
Message

rlm.driver.timeout



Severity

Warning

Description

A failure occurred during communication with the Remote LAN Module
(RLM).

Corrective action 1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status

2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot

rlm.firmware.update.failed
Message

rlm.firmware.update.failed

Severity

SVC_ERROR

Description

An error occurred during an update to the Remote LAN Module (RLM) firmware.
The firmware update might have failed due to the following reasons:

An incorrect RLM firmware image or a corrupted image file
A communication error while sending new firmware to the RLM
An update failure while applying new firmware at the RLM
A system reset or loss of power during an update

Corrective
action

1. Download the firmware image by entering the commands appropriate to your
system:

If you are using...       Then run the following command(s):

Clustered Data ONTAP      a. system image get -node node_name -package
                             http://web_server_name/path/RLM_FW.zip
                             -replace-package true
                          b. run -node node_name
                          c. software install RLM_FW.zip

Data ONTAP in 7-Mode      software install http://web_server_name/path/RLM_FW.zip -f

2. Make sure that the RLM is still operational by entering the command
appropriate to your system:

If you are using...       Then run the following command:

Clustered Data ONTAP      system node run -node node_name rlm status
Data ONTAP in 7-Mode      rlm status

3. Retry updating the RLM firmware.
For more information, see the section on updating RLM firmware in the System
Administration Guide for the version of Data ONTAP that your system is
running.
4. If the failure persists, contact technical support.
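The download commands in step 1 differ only by operating mode, so they can be generated from a few inputs when preparing a runbook. The following Python sketch assembles the command strings exactly as listed above. The function name and the mode labels "clustered" and "7-mode" are choices made for this illustration, and the sketch only builds text; it does not run anything against a storage system.

```python
from typing import List


def rlm_firmware_download_commands(mode: str, node_name: str, url: str) -> List[str]:
    """Return the step 1 download and install commands for the given mode.

    mode is "clustered" or "7-mode" (labels chosen for this sketch).
    url points at the firmware package, for example .../RLM_FW.zip.
    """
    package = url.rsplit("/", 1)[-1]  # file name portion, e.g. RLM_FW.zip
    if mode == "clustered":
        return [
            f"system image get -node {node_name} -package {url} -replace-package true",
            f"run -node {node_name}",
            f"software install {package}",
        ]
    if mode == "7-mode":
        return [f"software install {url} -f"]
    raise ValueError(f"unknown mode: {mode!r}")
```

Substitute the real node name and web server path for the placeholders before entering the resulting commands.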

rlm.firmware.upgrade.reqd
Message

rlm.firmware.upgrade.reqd

Severity

WARNING

Description

The Remote LAN Module (RLM) firmware version and the version of Data
ONTAP are incompatible and cannot communicate correctly about a particular
capability.

Corrective action Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the
System Administration Guide.

rlm.firmware.version.unsupported
Message

rlm.firmware.version.unsupported

Severity

WARNING

Description

The firmware on the Remote LAN Module (RLM) is an unsupported version
and must be upgraded.

Corrective
action

Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the
System Administration Guide.


rlm.heartbeat.bootFromBackup
Message

rlm.heartbeat.bootFromBackup

Severity

WARNING

Description

The system rebooted the Remote LAN Module (RLM) from its backup firmware
to restore RLM availability. The RLM is considered unavailable when the system
stops receiving heartbeat notifications from the RLM. To restore availability, the
system tries to reboot the RLM from the RLM's primary firmware. If that fails, the
system tries to reboot the RLM from the RLM's backup firmware. This message is
generated if the reboot from backup firmware restores availability.

Corrective
action

Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the System
Administration Guide.
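The recovery order in the rlm.heartbeat.bootFromBackup description (primary firmware first, then backup firmware) can be summarized in a few lines. The following Python sketch is a conceptual model only. The function `restore_rlm` and its two callables are hypothetical stand-ins and do not correspond to any Data ONTAP interface.

```python
from typing import Callable, Optional


def restore_rlm(
    reboot_primary: Callable[[], bool],
    reboot_backup: Callable[[], bool],
) -> Optional[str]:
    """Mirror the recovery order in the description above.

    Each callable is a hypothetical stand-in that reboots the RLM from one
    firmware image and returns True if heartbeats resume afterwards.
    Returns the EMS message name to log, or None if this message does not apply.
    """
    if reboot_primary():
        return None  # availability restored from the primary firmware
    if reboot_backup():
        # Only a successful reboot from backup firmware generates this message.
        return "rlm.heartbeat.bootFromBackup"
    return None  # RLM still unavailable; this sketch does not model that case
```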

rlm.heartbeat.resumed
Message

rlm.heartbeat.resumed

Severity

WARNING

Description

The system detected the resumption of Remote LAN Module (RLM) heartbeat
notifications, indicating that the RLM is now available. The earlier issue
indicated by the rlm.heartbeat.stopped message was resolved.

Corrective action None needed.

rlm.heartbeat.stopped
Message

rlm.heartbeat.stopped

Severity

WARNING

Description

The system did not receive an expected heartbeat message from the Remote LAN
Module (RLM). The RLM and the system exchange heartbeat messages, which
they use to detect when one or the other is unavailable.

Corrective
action

1. Connect to the RLM CLI.


2. Collect debugging information by entering the following commands:
rlm version
rlm config
priv set advanced



rlm log debug
rlm log messages

3. Run the RLM diagnostics:


a. From the boot loader prompt, enter
boot_diags

b. When the diagnostics main menu appears, select agent.


c. To test the syst/agent/RLM interface, select tests 2 and 6.
4. See the section on troubleshooting RLM problems in the System
Administration Guide.
5. If the problem persists, contact technical support.

rlm.network.link.down
Message

rlm.network.link.down

Severity

WARNING

Description

The Remote LAN Module (RLM) detected a link error on the RLM network port.
This can happen if a network cable is not plugged into the RLM network port. It
can also happen if the network that the RLM is connected to cannot run at 10/100
Mbps.

Corrective
action

1. Check whether the network cable is correctly plugged into the RLM network
port.
2. Check the link status LED on the RLM.
3. Verify that the network that the RLM is connected to supports autonegotiation
to 10/100 Mbps or is running at one of those speeds; otherwise, RLM network
connectivity does not work.

rlm.notConfigured
Message

rlm.notConfigured

Severity

WARNING

Description

This message occurs weekly to remind you to configure the Remote LAN Module
(RLM). The RLM is a physical device that is incorporated into your system to
provide remote access and remote management capabilities. To use the full
functionality of RLM, you need to configure it first.



Corrective
action

1. Use the rlm setup command to configure the RLM.


If necessary, use the rlm status command to obtain its MAC address.
2. Use the rlm status command to verify the RLM network configuration.
3. Use the rlm test autosupport command to verify that the RLM can send
AutoSupport e-mail.
Note that AutoSupport mailhosts and recipients must be properly configured
in Data ONTAP before issuing this command.

rlm.orftp.failed
Message

rlm.orftp.failed

Severity

WARNING

Description

A communication error occurred while sending information to or receiving
information from the Remote LAN Module (RLM).

Corrective action 1. Check whether the RLM is operational by entering the following command
at the Data ONTAP prompt:
rlm status

2. If the RLM is operational and this error persists, enter the following
command to reboot the RLM:
rlm reboot

3. If this message persists after you reboot the RLM, contact technical support.

rlm.snmp.traps.off
Message

rlm.snmp.traps.off

Severity

INFO

Description

The advanced privilege level in Data ONTAP was used to disable the SNMP
trap feature of the Remote LAN Module (RLM). This message occurs at boot.
This message also occurs when the SNMP trap capability was disabled and a
user invokes a Data ONTAP command to use the RLM to send an SNMP trap.

Corrective action

To enable RLM SNMP trap support, set the rlm.snmp.traps option to On.

rlm.systemDown.alert
Message

rlm.systemDown.alert

Severity

ALERT

Description

System remote management detected a system down event.


This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:
Remote Management Event: type={system_down|system_up|test|keep_alive},
severity={alert|warning|notice|normal|debug|info},
event={post_error|watchdog_reset|power_loss}

Corrective action

1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Contact technical support if the system is not serving data.
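
The key=value layout of the trap string lends itself to simple parsing on a trap receiver. The following Python sketch is illustrative only: the function name parse_rlm_event and the sample string are assumptions, and the exact spacing and delimiters in a real trap might differ from the form shown in this guide.

```python
import re

PREFIX = "Remote Management Event:"

def parse_rlm_event(text):
    """Split an RLM trap string into a dict of its key=value pairs."""
    if not text.startswith(PREFIX):
        raise ValueError("not a Remote Management Event string")
    body = text[len(PREFIX):]
    # Assumption: each pair looks like key=value and pairs are comma separated.
    return dict(re.findall(r"(\w+)=(\w+)", body))

# Hypothetical sample assembled from the documented value sets.
sample = "Remote Management Event: type=system_down, severity=alert, event=post_error"
fields = parse_rlm_event(sample)
print(fields["type"], fields["severity"], fields["event"])
```

A receiver could then branch on the type, severity, and event fields instead of matching the whole string.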

rlm.systemDown.notice
Message

rlm.systemDown.notice

Severity

NOTICE

Description

System remote management detected a system down event.


This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered the
trap. The string is structured in the following form with key=value pairs:
Remote Management Event: type={system_down|system_up|test|keep_alive},
severity={alert|warning|notice|normal|debug|info},
event={power_off_via_rlm|power_cycle_via_rlm|reset_via_rlm}

Corrective action

1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Consult technical support if the system is not serving data.

rlm.systemDown.warning
Message

rlm.systemDown.warning

Severity

WARNING

Description

System remote management detected a system down event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered the
trap. The string is structured in the following form with key=value pairs:
Remote Management Event: type={system_down|system_up|test|keep_alive},
severity={alert|warning|notice|normal|debug|info},
event={loss_of_heartbeat}

Corrective action

1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Consult technical support if the system is not serving data.

rlm.systemPeriodic.keepAlive
Message

rlm.systemPeriodic.keepAlive

Severity

INFO

Description

System remote management sent a periodic keep-alive event.


This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:
Remote Management Event: type={system_down|system_up|test|keep_alive},
severity={alert|warning|notice|normal|debug|info},
event={periodic_message}

Corrective action

None needed.

rlm.systemTest.notice
Message

rlm.systemTest.notice

Severity

NOTICE

Description

System remote management detected a test event.


This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:

Remote Management Event: type={system_down|system_up|test|keep_alive},
severity={alert|warning|notice|normal|debug|info},
event={test}

Corrective action

None needed.
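
Taken together, the trap-related entries above associate each event value with one message name. The following Python sketch collects those associations in a lookup table. The names EVENT_TO_MESSAGE and message_for_event are hypothetical, and the table reflects only the event values listed in this guide.

```python
# Event values and message names as listed in the entries above.
EVENT_TO_MESSAGE = {
    "post_error": "rlm.systemDown.alert",
    "watchdog_reset": "rlm.systemDown.alert",
    "power_loss": "rlm.systemDown.alert",
    "power_off_via_rlm": "rlm.systemDown.notice",
    "power_cycle_via_rlm": "rlm.systemDown.notice",
    "reset_via_rlm": "rlm.systemDown.notice",
    "loss_of_heartbeat": "rlm.systemDown.warning",
    "periodic_message": "rlm.systemPeriodic.keepAlive",
    "test": "rlm.systemTest.notice",
}

def message_for_event(event):
    """Return the message name for an event value, or None if it is not listed."""
    return EVENT_TO_MESSAGE.get(event)

print(message_for_event("loss_of_heartbeat"))
```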

rlm.userlist.update.failed
Message

rlm.userlist.update.failed

Severity

WARNING

Description

There was an error while updating user information for the Remote LAN Module
(RLM). When user information is updated on Data ONTAP, the RLM is also
updated with the new changes. This enables users to log in to the RLM.

Corrective action

1. Check whether the RLM is operational by entering the following command at
the Data ONTAP prompt:
rlm status

2. If the RLM is operational and this error persists, reboot the RLM by entering
the following command:
rlm reboot

3. Retry the operation that caused the error message.
4. If this message persists after you reboot the RLM, contact technical support.

BMC messages
The BMC provides remote platform management capabilities on FAS20xx and SA200 systems.
BMC capabilities include remote access, monitoring, troubleshooting, logging, and alerting features.
The BMC sends AutoSupport messages through its independent management interface, regardless of
the state of the system.

How and when BMC AutoSupport e-mail notifications are sent
BMC e-mail notifications are sent to configured recipients designated by the AutoSupport feature.
The e-mail notifications have the title "System Alert from BMC of filer serial number," followed
by the message type. The serial number is that of the controller with which the BMC is
associated.
Typical BMC-generated AutoSupport messages occur under the following conditions:

The system reboots unexpectedly
A system reboot fails
A user-issued action triggers an AutoSupport message

What BMC e-mail notifications include


The different parts of BMC e-mail messages contain information about your system.
BMC e-mail notifications include the following information:

Subject line: a system notification from the BMC of the system, listing the system condition or
event that caused the AutoSupport message and the log level.
Message body: the IP address, netmask, and other information about the system.
Attachments: system configuration and sensor information.

BMC-generated AutoSupport messages


The BMC can generate a variety of messages telling you of problems or events occurring on your
system.

BMC_ASUP_UNKNOWN
Message

BMC_ASUP_UNKNOWN

Description

Unknown Baseboard Management Controller (BMC) error.

Corrective action

Report the problem to technical support.

REBOOT (abnormal)
Message

REBOOT (abnormal)

Description

An abnormal reboot occurred.

Corrective action

Verify that the system has returned to operation.

REBOOT (power loss)


Message

REBOOT (power loss)

Description

A power failure was detected, and the system restarted. This occurs when the
system is power-cycled by the external switches or in a true power loss.

Corrective action Verify that the system has returned to operation.

REBOOT (watchdog reset)


Message

REBOOT (watchdog reset)

Description

The system stopped responding and was rebooted by the Baseboard
Management Controller (BMC). This occurs when the BMC watchdog is
triggered.

Corrective action Verify that the system has returned to operation.

SYSTEM_BOOT_FAILED (POST failed)


Message

SYSTEM_BOOT_FAILED (POST failed)

Description

The system failed to pass the BIOS POST. This occurs when the BIOS status
sensor is in a failed or hung state.

Corrective action

1. Issue a system reset backup command from the Baseboard Management
Controller (BMC) console, and if the system can come up to the boot loader,
issue the flash command to update the primary BIOS firmware.
2. If the system is still nonresponsive, contact technical support.

SYSTEM_POWER_OFF (environment)
Message

SYSTEM_POWER_OFF (environment)

Description

An environmental sensor entered a critical, nonrecoverable state, and Data
ONTAP has been requested to power off the system.

Corrective action Verify the environmental conditions of the system.

USER_TRIGGERED (bmc test)


Message

USER_TRIGGERED (bmc test)

Description

A user triggered the Baseboard Management Controller (BMC) AutoSupport
internal test through the BMC console, Systems Management Architecture for
Server Hardware (SMASH), or Intelligent Platform Management Interface
(IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system nmi)


Message

USER_TRIGGERED (system nmi)

Description

A user requested a core dump through the BMC console, SMASH, or IPMI.

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power cycle)


Message

USER_TRIGGERED (system power cycle)

Description

A user issued a power-cycle command through the Baseboard Management
Controller (BMC) console, Systems Management Architecture for Server
Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power off)


Message

USER_TRIGGERED (system power off)

Description

A user issued a power off command through the Baseboard Management
Controller (BMC) console, Systems Management Architecture for Server
Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power on)


Message

USER_TRIGGERED (system power on)

Description

A user issued a power on command through the Baseboard Management
Controller (BMC) console, Systems Management Architecture for Server
Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power soft-off)


Message

USER_TRIGGERED (system power soft-off)

Description

A user issued a power soft-off command through the Baseboard Management
Controller (BMC) console, Systems Management Architecture for Server
Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system reset)


Message

USER_TRIGGERED (system reset)

Description

A user issued a reset command through the Baseboard Management Controller
(BMC) console, Systems Management Architecture for Server Hardware
(SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.
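
The BMC-generated AutoSupport message types listed above fall into a few families that share a corrective action. The following Python sketch shows one way a mail filter might look up the action from the message type. The function name corrective_action is hypothetical, the SYSTEM_BOOT_FAILED text is abbreviated from the entry above, and treating unrecognized types like BMC_ASUP_UNKNOWN is an assumption of this sketch.

```python
def corrective_action(message_type):
    """Return the corrective action this guide gives for a BMC AutoSupport message type."""
    if message_type.startswith("USER_TRIGGERED"):
        return "Verify that the command was issued by an authorized user."
    if message_type.startswith("REBOOT"):
        return "Verify that the system has returned to operation."
    if message_type.startswith("SYSTEM_POWER_OFF"):
        return "Verify the environmental conditions of the system."
    if message_type.startswith("SYSTEM_BOOT_FAILED"):
        # Abbreviated from the two-step procedure in the entry above.
        return ("Issue a system reset backup command from the BMC console; "
                "if the system is still nonresponsive, contact technical support.")
    # BMC_ASUP_UNKNOWN, plus (as an assumption) any type not listed in this guide.
    return "Report the problem to technical support."

print(corrective_action("REBOOT (watchdog reset)"))
```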

EMS messages about the BMC


The EMS might send messages to your system console about the BMC.

bmc.asup.crit
Message

bmc.asup.crit

Description

This message occurs when the Baseboard Management Controller (BMC) sends
an AutoSupport message of a CRITICAL priority.

Corrective action

The action you take depends on whether the operating environment for the
system, storage, or associated cabling has changed.

If the operating environment has changed, shut down and power off the
system until the environment is restored to normal operations.

If the operating environment has not changed, check for previous errors and
warnings. Also check for hardware statistics from Fibre Channel, SCSI, disk
drives, other communications mechanisms, and previous administrative
activities.

bmc.asup.error
Message

bmc.asup.error

Description

This message occurs when the Baseboard Management Controller (BMC) fails
to construct the necessary attachments of an AutoSupport message.

Corrective action This message indicates an internal error with the BMC's AutoSupport
processing. Contact technical support.

bmc.asup.init
Message

bmc.asup.init

Description

This message occurs when the Baseboard Management Controller (BMC) fails
to initialize its AutoSupport subsystem due to a lack of resources.

Corrective action This message indicates an internal error with the BMC's AutoSupport
processing. Contact technical support.

bmc.asup.queue
Message

bmc.asup.queue

Description

This message occurs when the Baseboard Management Controller (BMC) has
too many outstanding AutoSupport messages and no longer has enough
resources to service them.

Corrective action

This message might indicate an issue with your AutoSupport configuration.
1. Ensure that your system is configured to use the correct AutoSupport SMTP
mail host, and that the mail host is properly configured to handle
AutoSupport messages originating from the BMC.
2. For additional help, contact technical support.

bmc.asup.send
Message

bmc.asup.send

Description

This message occurs when the Baseboard Management Controller (BMC) sends
an AutoSupport message.

Corrective action

1. Follow the corrective action recommended for the AutoSupport message
that was sent.
2. For additional help, contact technical support.

bmc.asup.smtp
Message

bmc.asup.smtp

Description

This message occurs when the Baseboard Management Controller (BMC) fails
to contact the mailhost when attempting to send an AutoSupport message.

Corrective action

This message indicates an issue with your AutoSupport configuration.
1. Ensure that your system is configured to use the correct AutoSupport SMTP
mail host and that the mail host is properly configured to handle AutoSupport
messages originating from the BMC.
2. For additional help, contact technical support.

bmc.batt.id
Message

bmc.batt.id

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot read the part number information stored in the battery configuration
firmware.

Corrective action Contact technical support for the current procedure to determine whether the
battery failed.

bmc.batt.invalid
Message

bmc.batt.invalid

Description

This message occurs when the Baseboard Management Controller (BMC)
determines that the battery installed is not the correct model for your system.

Corrective action Contact technical support to request the appropriate replacement battery for
your model of system.

bmc.batt.mfg
Message

bmc.batt.mfg

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot read the manufacturer information stored in the battery configuration
firmware.

Corrective action Contact technical support for the current procedure to determine whether the
battery failed.

bmc.batt.rev
Message

bmc.batt.rev

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot read the revision code stored in the battery configuration firmware.

Corrective action Contact technical support for the current procedure to determine whether the
battery failed.

bmc.batt.seal
Message

bmc.batt.seal

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot seal the battery's configuration information after a battery upgrade.

Corrective action Contact technical support for the current procedure to determine whether the
battery failed.

bmc.batt.unknown
Message

bmc.batt.unknown

Description

This message occurs when the Baseboard Management Controller (BMC)
determines that the installed battery is not a recognized part that is approved for
use in your system.

Corrective action Contact technical support to request the appropriate replacement battery for
your model of system.

bmc.batt.unseal
Message

bmc.batt.unseal

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot unseal the battery's configuration information to determine whether the
battery firmware requires an upgrade.

Corrective action Contact technical support for the current procedure to determine whether the
battery failed.

bmc.batt.upgrade
Message

bmc.batt.upgrade

Description

This message occurs before the Baseboard Management Controller (BMC)
upgrades the battery's configuration firmware. It reports the present and new
revisions of the battery configuration.

Corrective action None.

bmc.batt.upgrade.busy
Message

bmc.batt.upgrade.busy

Description

This message occurs when the Baseboard Management Controller (BMC)
determines that the battery configuration firmware requires an upgrade, but that
the BMC is too busy to perform the upgrade.

Corrective action

It is normal to get this message one time after a BMC upgrade. However, if this
message is issued more than once, it indicates a problem with your system.
Contact technical support for the current procedure to determine whether your
system needs to be replaced.

bmc.batt.upgrade.failed
Message

bmc.batt.upgrade.failed

Description

This message occurs when the Baseboard Management Controller (BMC) cannot
upgrade the battery configuration firmware to the latest revision.

Corrective action

In most cases, this error does not impact the functionality of your system, but
replacing the battery might be advised at your next maintenance window.
Contact technical support for the current procedure to determine whether the
battery needs to be replaced.

bmc.batt.upgrade.failure
Message

bmc.batt.upgrade.failure

Description

The Baseboard Management Controller (BMC) generates this message for every
configuration item in the battery configuration firmware that could not be
updated during a battery upgrade.

Corrective action

1. Remove and reinsert the controller module. In most cases, this forces the
BMC to reattempt and successfully upgrade the battery.
2. If you see this message more than once, contact technical support for the
current procedure to determine whether the battery needs to be replaced.

bmc.batt.upgrade.ok
Message

bmc.batt.upgrade.ok

Description

This message occurs when the entire battery upgrade process is complete.

Corrective action

None.

bmc.batt.upgrade.power-off
Message

bmc.batt.upgrade.power-off

Description

This message occurs in the rare event where the Baseboard Management
Controller (BMC) cannot turn on system power, and the battery has not been
checked to determine whether it requires a configuration upgrade.

Corrective action

1. Remove and reinsert the controller module.
2. If you continue to see this message, contact technical support for the current
procedure to determine whether the controller module needs to be replaced.

bmc.batt.upgrade.voltagelow
Message

bmc.batt.upgrade.voltagelow

Description

The Baseboard Management Controller (BMC) generates this message when the
battery is discharged to below 6.0 V and requires a configuration firmware
update.

Corrective action

This message is printed every 10 minutes until the battery is recharged. If you
continue to see this message after one hour, contact technical support for the
current procedure to determine whether the battery needs to be replaced.
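
The one-hour rule above can be checked against the times at which the message was logged. The following Python sketch is a minimal illustration. The function name should_escalate and the sample timestamps are assumptions, and "after one hour" is interpreted here as the message still appearing 60 minutes or more after it was first seen.

```python
from datetime import datetime, timedelta

ESCALATE_AFTER = timedelta(hours=1)

def should_escalate(timestamps):
    """True if the message was still being logged one hour or more after it first appeared."""
    if not timestamps:
        return False
    ordered = sorted(timestamps)
    return ordered[-1] - ordered[0] >= ESCALATE_AFTER

# Hypothetical log times: one message every 10 minutes from 12:00 through 13:00.
start = datetime(2014, 7, 1, 12, 0)
seen = [start + timedelta(minutes=10 * i) for i in range(7)]
print(should_escalate(seen))
```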

bmc.batt.voltage
Message

bmc.batt.voltage

Description

This message occurs in the rare event where the Baseboard Management
Controller (BMC) determines that the battery configuration firmware requires
an update and the battery is successfully prepared for the update, but the BMC
cannot read the battery voltage sensor.

Corrective action

Contact technical support for the current procedure to determine whether the
battery needs to be replaced.

bmc.config.asup.off
Message

bmc.config.asup.off

Description

This message occurs in the rare event that the Baseboard Management
Controller (BMC) detects corruption in the BMC's internal cached copy of the
AutoSupport mail host and/or configured destinations. AutoSupport messages
from the BMC are disabled until the system boots.

Corrective action

Boot the system to ensure that the BMC's cache of the AutoSupport
configuration is correct.

bmc.config.corrupted
Message

bmc.config.corrupted

Description

This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the SSH service on the BMC LAN interface is disabled until the system boots.

Corrective action

1. Boot the system. Upon boot, the Secure Shell (SSH) host keys for the BMC
are regenerated. The previous host keys for the BMC are no longer valid and
cannot be used for logins.
2. Contact technical support to determine whether your system needs
maintenance.

bmc.config.default
Message

bmc.config.default

Description

This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the Secure Shell (SSH) service on the BMC LAN interface is disabled until the
system boots.

Corrective action

1. Boot the system. Upon boot, the SSH host keys for the BMC are regenerated.
The previous host keys for the BMC are no longer valid and cannot be used
for logins.
2. Contact technical support to determine whether your system needs
maintenance.

bmc.config.default.pef.filter
Message

bmc.config.default.pef.filter

Description

This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.

Corrective action

Most users need to take no action. However, if you want to use custom Intelligent
Platform Management Interface (IPMI) PEF tables, you need to reenable the
BMC IPMI LAN interface, and reload any custom PEF tables that might be
defined for your site.

bmc.config.default.pef.policy
Message

bmc.config.default.pef.policy

Description

This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.

Corrective action

Most users need to take no action. However, if you want to use custom IPMI PEF
tables, you need to reenable the BMC Intelligent Platform Management Interface
(IPMI) LAN interface, and reload any custom PEF tables that might be defined for
your site.

bmc.config.fru.systemserial
Message

bmc.config.fru.systemserial

Description

This message occurs when the Baseboard Management Controller (BMC)
detects an invalid System Serial Number field in the system's field-replaceable
unit (FRU) configuration area.

Corrective action Contact technical support to determine the maintenance procedure for your
system.

bmc.config.mac.error
Message

bmc.config.mac.error

Description

This message occurs when the Baseboard Management Controller (BMC)
Ethernet Media Access Control (MAC) identifier is invalid.

Corrective action Contact technical support to determine the corrective procedure for your
system.

bmc.config.net.error
Message

bmc.config.net.error

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot start networking support on the BMC LAN interface.

Corrective action Contact technical support to determine the corrective procedure for your
system.

bmc.config.upgrade
Message

bmc.config.upgrade

Description

This message occurs when the Baseboard Management Controller (BMC)
internal configuration defaults are updated.

Corrective action None.

bmc.power.on.auto
Message

bmc.power.on.auto

Description

This message occurs when, upon power up, the Baseboard Management
Controller (BMC) detects that the system was previously soft powered-off.

Corrective action None.

bmc.reset.ext
Message

bmc.reset.ext

Description

This message occurs when the Baseboard Management Controller (BMC)
detects that a bmc reboot command was issued on the system previously.

Corrective action None.

bmc.reset.int
Message

bmc.reset.int

Description

This message occurs when the Baseboard Management Controller (BMC) was
reset through the BMC command sequence ngs smash; set reboot=1;
priv set diag.

Corrective action None.

bmc.reset.power
Message

bmc.reset.power

Description

This message occurs when the Baseboard Management Controller (BMC)
detects a system power up, or after the BMC is upgraded.

Corrective action None.

bmc.reset.repair
Message

bmc.reset.repair

Description

This message occurs when the Baseboard Management Controller (BMC)
detects and corrects an internal BMC error.

Corrective action If you receive this message frequently, contact technical support to determine
the corrective procedure for your system.

bmc.reset.unknown
Message

bmc.reset.unknown

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot determine why it was reset.

Corrective action This message usually indicates a BMC internal error. Contact technical support
to determine the corrective procedure for your system.

bmc.sensor.batt.charger.off
Message

bmc.sensor.batt.charger.off

Description

This message occurs when the Baseboard Management Controller (BMC)
detects that the battery charger cannot be disabled for the hourly battery load
test.

Corrective action Contact technical support to determine the corrective procedure for your
system.

bmc.sensor.batt.charger.on
Message

bmc.sensor.batt.charger.on

Description

This message occurs when the Baseboard Management Controller (BMC)
cannot reenable the battery charger after the hourly battery load test.

Corrective action Contact technical support to determine the corrective procedure for your
system.

bmc.sensor.batt.time.run.invalid
Message

bmc.sensor.batt.time.run.invalid

Description

This message occurs when the Baseboard Management Controller (BMC)
detects that the battery's calculated run time differs substantially from the
battery's run-time sensor.

Corrective action None.

bmc.ssh.key.missing
Message

bmc.ssh.key.missing

Description

This message occurs when the Baseboard Management Controller (BMC)
detects that the Secure Shell (SSH) host keys for the BMC are corrupted or
missing.

Corrective action Reboot the system. The boot sequence regenerates the host key and makes the
BMC SSH service available again.
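
Several of the EMS messages in this chapter list "None" as their corrective action. The following Python sketch separates those messages from the ones that call for follow-up, which might be useful when filtering console logs. The names NO_ACTION_MESSAGES and needs_follow_up are hypothetical, and the set contains only the message names that this chapter marks with a corrective action of "None".

```python
# Messages whose corrective action in this chapter is "None".
NO_ACTION_MESSAGES = {
    "bmc.batt.upgrade",
    "bmc.batt.upgrade.ok",
    "bmc.config.upgrade",
    "bmc.power.on.auto",
    "bmc.reset.ext",
    "bmc.reset.int",
    "bmc.reset.power",
    "bmc.sensor.batt.time.run.invalid",
}

def needs_follow_up(message_name):
    """True for any bmc.* message that is not in the no-action set."""
    return message_name.startswith("bmc.") and message_name not in NO_ACTION_MESSAGES

print(needs_follow_up("bmc.reset.power"))
print(needs_follow_up("bmc.ssh.key.missing"))
```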

Additional LED error conditions


LEDs enable you to monitor your storage system and its components. In specific circumstances,
LEDs can indicate conditions on the system that are not described elsewhere in this guide.
Each storage system platform has LEDs on the chassis, controller, fans, and PSUs. These LEDs
provide a high-level view of the status of your system and network activity. In some instances, an
LED might indicate a configuration problem on the system.

Clearing the fault LED when software is licensed but not enabled
In some circumstances, the storage controller front panel fault LED illuminates when a software
license is installed but the protocol is not enabled.
About this task

On specific platform models running Data ONTAP operating in 7-Mode, if a protocol license is
installed but not enabled, the storage controller fault LED illuminates, yet no error message appears
in /etc/messages, EMS, or AutoSupport. This is known to affect FAS2020 and FAS2050 systems
running Data ONTAP 7.3.1 or 7.3.6RC1; FAS2040 systems running Data ONTAP 8.1RC1; and 3210
systems running Data ONTAP 8.0.1P2.
Steps

1. Determine which protocol licenses are installed on the system by using the license command.
Because protocol licenses should match on both nodes in an HA pair, be sure to check each node.
2. For each protocol license installed, confirm that the protocol is enabled and configured:

In the following example, NFS is configured and properly enabled:


Node1> nfs status
NFS server is running.

In the following example, CIFS is licensed but not configured:


Node1> cifs
CIFS not configured.

Use "cifs setup" to configure

3. If necessary, refer to the Data ONTAP File Access and Protocols Management Guide for 7-Mode
for directions on configuring the protocol.

Copyright information
Copyright 1994-2014 NetApp, Inc. All rights reserved. Printed in the U.S.
No part of this document covered by copyright may be reproduced in any form or by any means -
graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an
electronic retrieval system - without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and
disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE,
WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice.
NetApp assumes no responsibility or liability arising from the use of products described herein,
except as expressly agreed to in writing by NetApp. The use or purchase of this product does not
convey a license under any patent rights, trademark rights, or any other intellectual property rights of
NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents,
or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to
restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer
Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).

Trademark information
NetApp, the NetApp logo, Network Appliance, the Network Appliance logo, Akorri,
ApplianceWatch, ASUP, AutoSupport, BalancePoint, BalancePoint Predictor, Bycast, Campaign
Express, ComplianceClock, Customer Fitness, Cryptainer, CryptoShred, CyberSnap, Data Center
Fitness, Data ONTAP, DataFabric, DataFort, Decru, Decru DataFort, DenseStak, Engenio, Engenio
logo, E-Stack, ExpressPod, FAServer, FastStak, FilerView, Fitness, Flash Accel, Flash Cache, Flash
Pool, FlashRay, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexSuite, FlexVol, FPolicy,
GetSuccessful, gFiler, Go further, faster, Imagine Virtually Anything, Lifetime Key Management,
LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NearStore, NetCache, NOW (NetApp
on the Web), Onaro, OnCommand, ONTAPI, OpenKey, PerformanceStak, RAID-DP, ReplicatorX,
SANscreen, SANshare, SANtricity, SecureAdmin, SecureShare, Select, Service Builder, Shadow
Tape, Simplicity, Simulate ONTAP, SnapCopy, Snap Creator, SnapDirector, SnapDrive, SnapFilter,
SnapIntegrator, SnapLock, SnapManager, SnapMigrator, SnapMirror, SnapMover, SnapProtect,
SnapRestore, Snapshot, SnapSuite, SnapValidator, SnapVault, StorageGRID, StoreVault, the
StoreVault logo, SyncMirror, Tech OnTap, The evolution of storage, Topio, VelocityStak, vFiler,
VFM, Virtual File Manager, VPolicy, WAFL, Web Filer, and XBB are trademarks or registered
trademarks of NetApp, Inc. in the United States, other countries, or both.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. A complete and current list of
other IBM trademarks is available on the web at www.ibm.com/legal/copytrade.shtml.
Apple is a registered trademark and QuickTime is a trademark of Apple, Inc. in the United States
and/or other countries. Microsoft is a registered trademark and Windows Media is a trademark of
Microsoft Corporation in the United States and/or other countries. RealAudio, RealNetworks,
RealPlayer, RealSystem, RealText, and RealVideo are registered trademarks and RealMedia,
RealProxy, and SureStream are trademarks of RealNetworks, Inc. in the United States and/or other
countries.
All other brands or products are trademarks or registered trademarks of their respective holders and
should be treated as such.
NetApp, Inc. is a licensee of the CompactFlash and CF Logo trademarks.
NetApp, Inc. NetCache is certified RealSystem compatible.

How to send your comments


You can help us to improve the quality of our documentation by sending us your feedback.
Your feedback is important in helping us to provide the most accurate and high-quality information.
If you have suggestions for improving this document, send us your comments by email to
doccomments@netapp.com. To help us direct your comments to the correct division, include in the
subject line the product name, version, and operating system.
You can also contact us in the following ways:

NetApp, Inc., 495 East Java Drive, Sunnyvale, CA 94089 U.S.


Telephone: +1 (408) 822-6000
Fax: +1 (408) 822-4501
Support telephone: +1 (888) 463-8277

Index
0200: Failure Fixed Disk
error message 157, 166
0230: System RAM Failed at offset
error message 158
0231: Shadow RAM failed at offset
error message 158
0232: Extended RAM failed at address line
error message 159
0235: Multiple-bit ECC error occurred
error message 159
023C: Bad DIMM found in slot #
error message 159
023E: Node Memory Interleaving disabled
error message 160
0241: Agent Read Timeout
error message 160
0242: Invalid FRU information
error message 161
0250: System battery is dead
error message 161
0251: System CMOS checksum bad
error message 162
0253: Clear CMOS jumper detected
error message 162
0260: System timer error
error message 162
0280: Previous boot incomplete
error message 162
02C2: No valid Boot Loader in System Flash - Non Fatal
error message 163
02C3: No valid Boot Loader in System Flash - Fatal
error message 163
02F9: FPGA jumper detected
POST error message 163
02FA: Watchdog Timer Reboot (PciInit)
error message 164
02FB: Watchdog Timer Reboot (MemTest)
POST error message 164
02FC: LDTStop Reboot (HTLinkInit)
error message 165
20xx systems
controller module fault LED 32
controller module LEDs 30
Ethernet port LEDs 32
fault LED 30
Fibre Channel port LEDs 32
LEDs on the back of the controller module 32
LEDs on the front of the chassis 30
NVMEM LED 32
power LED 30
PSU LEDs 34
remote management port LEDs 32
22xx system POST error messages
0231: Shadow RAM Failed at offset 166
22xx systems
chassis fault LED 35
controller activity LED 35
controller fault LED 36
Fibre Channel port LEDs 36
GbE port LEDs 36
internal drive LEDs 39
internal FRU LEDs 43
introduction to LEDs on 35
introduction to POST error messages 166
LEDs on the back of the controller 36
LEDs on the front of the chassis 35
management port LEDs 36
mezzanine card 36
NVMEM LED 36
power LED 35
PSU LEDs 41
SAS port LEDs 36
serial port 36
2520 systems
10/100/1000Base-T port LEDs 47
1000Base-T port LEDs 47
10GBase-T port LEDs 47
controller attention LED 47
LEDs on the back of the controller 47
management port LEDs 47
NVMEM LED 47
SAS port LEDs 47
2520, 2552, and 2554 systems
chassis attention LED 43
controller activity LED 43
LEDs on the front of the chassis 43
power LED 43
255x systems
10-GbE port LEDs 50
10/100/1000Base-T port LEDs 50
1000Base-T port LEDs 50
controller attention LED 50

LEDs on the back of the controller 50
management port LEDs 50
NVMEM LED 50
SAS port LEDs 50
UTA2/CNA port LEDs 50
25xx system POST error messages
0231: Shadow RAM Failed at offset 166
25xx systems
internal drive LEDs 45
introduction to LEDs on 43
introduction to POST error messages 166
location and meaning of internal FRU LEDs 55
PSU LEDs 53
31xx systems
controller activity LED 60
fan LED 63
fault LED 60
fault LEDs 62
FRU LEDs 65
introduction to POST error messages 157
LEDs on the front of the chassis 60
location and meaning of Ethernet port LEDs 62
location and meaning of FC port LED 62
location and meaning of LEDs on the back of the
controller 62
power LED 60
PSU LEDs 63
32xx system POST error messages
0231: Shadow RAM Failed at offset 166
32xx systems
chassis fault LED 65
controller activity LED 65
controller fault LED 66
controller-I/O expansion module configuration 65
dual-controller configuration 65
fan LED 70
Fibre Channel port LEDs 66
GbE port LEDs 66
I/O expansion module fault LED 69
internal FRU LEDs 72
introduction to POST error messages 166
LED on the back of the I/O expansion module 69
LEDs on the back of the controller 66
LEDs on the front of the chassis 65
management port LEDs 66, 69
NVMEM LED 66
power LED 65
PSU LEDs 71
SAS port LEDs 66
serial port 66
60xx system error messages
02F9: FPGA jumper detected 163
60xx system POST error messages
02FB: Watchdog Timer Reboot (MemTest) 164
60xx systems
activity LED 73
fan LEDs 75
Fibre Channel port LEDs 74
GbE port LEDs 74
introduction to POST error messages 157
LEDs on the back of the controller 74
LEDs on the front of the controller 73
location and meaning of PSU LEDs 76
power LED 73
RLM LEDs 74
status LED 73
62xx system POST error messages
0231: Shadow RAM Failed at offset 166
62xx systems
10-GbE port LEDs 79
8-Gb Fibre Channel port LEDs 79
chassis fault LED 77
controller activity LED 77
controller fault LED 79
controller-I/O expansion module configuration 77
dual-controller configuration 77
fan LEDs 84
GbE port LEDs 79
I/O expansion module fault LED 83
internal FRU LEDs 85
introduction to POST error messages 166
LEDs on the back of the controller 79
LEDs on the back of the I/O expansion module 83
power LED 77
private management port LEDs 79, 83
PSU LEDs 84
remote management port LEDs 79
serial port 79
USB port 79
7-Mode error messages
UTA2 (CNA) 306
8020 systems
10/100/1000Base-T port LEDs 89
10GbE port LEDs 89
chassis attention LED 86
controller activity LED 86
controller attention LED 89
dual-controller configuration 86
Fibre Channel port LEDs 89
LEDs on the back of the controller 89

location and meaning of fan LEDs 98
location and meaning of internal FRU LEDs 100
management port LEDs 89
NVRAM LED 89
power LED 86
SAS port LEDs 89
UTA2/CNA port LEDs 89
8040, 8060, and 8080 systems
1000Base-T port LEDs 92
10GbE port LEDs 92
controller attention LED 92
Fibre Channel port LEDs 92
LEDs on the back of the controller 92
location and meaning of fan LEDs 98
location and meaning of internal FRU LEDs 101
management port LEDs 92
NVRAM LED 92
SAS port LEDs 92
UTA2/CNA port LEDs 92
80xx system POST error messages
0231: Shadow RAM Failed at offset 166
80xx systems
chassis attention LED 88
controller activity LED 88
dual-controller configuration 88
HA interconnect port LEDs 96
HA interconnect ports 96
I/O expansion module fault LED 96
introduction to LEDs on 86
introduction to POST error messages 166
LEDs on the back of the I/O expansion module 96
location and meaning of PSU LEDs 99
power LED 88

A
AutoSupport messages 27

B
BIOS and boot loader progress
method of viewing progress on the console 155
method of viewing progress through the Bios Status
sensor 156
BMC
e-mail contents 335
function 335
how and when e-mail AutoSupport messages are
sent 335
systems containing the 335

BMC-generated messages
BMC_ASUP_UNKNOWN 336
REBOOT (abnormal) 336
REBOOT (power loss) 336
REBOOT (watchdog reset) 336
SYSTEM_BOOT_FAILED (POST failed) 336
SYSTEM_POWER_OFF (environment) 337
USER_TRIGGERED (bmc test) 337
USER_TRIGGERED (system nmi) 337
USER_TRIGGERED (system power cycle) 337
USER_TRIGGERED (system power off) 337
USER_TRIGGERED (system power on) 338
USER_TRIGGERED (system power soft-off) 338
USER_TRIGGERED (system reset) 338
boot devices
introduction to EMS messages for USB 296
Boot error messages
Boot device err 174
Cannot initialize labels 174
Cannot read labels 174
Configuration exceeds max PCI space 174
DIMM slot # has correctable ECC errors 175
Dirty shutdown in degraded mode 175
Disk label processing failed 175
Drive %s.%d not supported 175
Error detection detected too many errors to analyze
at once 175
FC-AL loop down, adapter %d 176
File system may be scrambled 176
Halted disk firmware too old 177
Halted: Illegal configuration 177
Invalid PCI card slot %d 177
No /etc/rc 177
No /etc/rc, running setup 178
No disk controllers 178
No disks 178
No network interfaces 178
No NVRAM present 178
NVRAM #n downrev 179
NVRAM: wrong pci slot 179
Panic: DIMM slot #n has uncorrectable ECC errors 179
This platform is not supported on this release 179
Too many errors in too short time 180
Warning: Motherboard Revision not available 180
Warning: Motherboard Serial Number not available
180
Warning: system serial number is not available 180
Watchdog error 180
Watchdog failed 180

C
clustered system error messages
UTA2 (CNA) 308
comments
how to send feedback about documentation 352

D
diagnostic tools
boot_diags command 28
forms and use of 28
sldiag commands 28
doccomments
how to send feedback about documentation by using 352
documentation
how to send feedback about 352
where to find platform troubleshooting 28

E
EMS messages
Chassis power supply removed: PS# 187
information provided in 182
introduction to environmental 182
No network interfaces 178
rlm.firmware.update.failed 327
ses.access.noMoreValidPaths 264
ses.access.noShelfSES 265
ses.disk.configOk 268
ses.download.shelfToReboot 269
ses.drive.shelfAddr.mm 270
EMS messages about the BMC
bmc.asup.crit 338
bmc.asup.error 339
bmc.asup.init 339
bmc.asup.queue 339
bmc.asup.send 339
bmc.asup.smtp 340
bmc.batt.id 340
bmc.batt.invalid 340
bmc.batt.mfg 340
bmc.batt.rev 341
bmc.batt.seal 341
bmc.batt.unknown 341
bmc.batt.unseal 341
bmc.batt.upgrade 341
bmc.batt.upgrade.busy 342
bmc.batt.upgrade.failed 342
bmc.batt.upgrade.failure 342
bmc.batt.upgrade.ok 343
bmc.batt.upgrade.power-off 343
bmc.batt.upgrade.voltagelow 343
bmc.batt.voltage 343
bmc.config.asup.off 344
bmc.config.corrupted 344
bmc.config.default 344
bmc.config.default.pef.filter 344
bmc.config.default.pef.policy 345
bmc.config.fru.systemserial 345
bmc.config.mac.error 345
bmc.config.net.error 345
bmc.config.upgrade 346
bmc.power.on.auto 346
bmc.reset.ext 346
bmc.reset.int 346
bmc.reset.power 346
bmc.reset.repair 347
bmc.reset.unknown 347
bmc.sensor.batt.charger.off 347
bmc.sensor.batt.charger.on 347
bmc.sensor.batt.time.run.invalid 347
bmc.ssh.key.missing 348
EMS messages about the RLM
rlm.driver.hourly.stats 325
rlm.driver.mailhost 326
rlm.driver.network.failure 326
rlm.driver.timeout 326
rlm.firmware.upgrade.reqd 328
rlm.firmware.version.unsupported 328
rlm.heartbeat.bootFromBackup 329
rlm.heartbeat.resumed 329
rlm.heartbeat.stopped 329
rlm.network.link.down 330
rlm.notConfigured 330
rlm.orftp.failed 331
rlm.snmp.traps.off 331
rlm.systemDown.alert 331
rlm.systemDown.notice 332
rlm.systemDown.warning 332
rlm.systemPeriodic.keepAlive 333
rlm.systemTest.notice 333
rlm.userlist.update.failed 334
EMS messages about the SP
sp.firmware.upgrade.reqd 314
sp.firmware.version.unsupported 315
sp.heartbeat.resumed 315
sp.heartbeat.stopped 315
sp.network.link.down 316

sp.notConfigured 316
sp.orftp.failed 317
sp.snmp.traps.off 317
sp.userlist.update.failed 317
spmgmt.driver.hourly.stats 318
spmgmt.driver.mailhost 319
spmgmt.driver.network.failure 319
spmgmt.driver.timeout 319
environmental EMS messages
Chassis fan FRU failed 182
Chassis over temperature on XXXX 183
Chassis over temperature shutdown on XXXX 183
Chassis Power Degraded: 3.3V in warn high state 183
Chassis power degraded: PS# 184
Chassis Power Fail: PS# 184
Chassis Power Shutdown 184
Chassis power shutdown: 3.3V is in warn low state
185
Chassis power supply degraded: PS# 186
Chassis power supply fail: PS# 186
Chassis power supply off: PS# 186, 187
Chassis power supply OK: PS# 187
Chassis power supply removed: PS# 187
Chassis Power Supply: PS# removed 185
Chassis under temperature on XXXX 188
Chassis under temperature shutdown on XXXX 188
Fan: # is spinning below tolerable speed 188
introduction to 182
monitor.chassisFan.degraded 189
monitor.chassisFan.ok 189
monitor.chassisFan.removed 189
monitor.chassisFan.slow 189
monitor.chassisFan.stop 190
monitor.chassisFan.warning 190
monitor.chassisFanFail.xMinShutdown 190
monitor.chassisPower.degraded 190
monitor.chassisPower.ok 191
monitor.chassisPowerSupplies.ok 191
monitor.chassisPowerSupply.degraded 191
monitor.chassisPowerSupply.notPresent 191
monitor.chassisPowerSupply.off 192
monitor.chassisPowerSupply.ok 192
monitor.chassisTemperature.cool 192
monitor.chassisTemperature.ok 192
monitor.chassisTemperature.warm 192
monitor.cpuFan.degraded 193
monitor.cpuFan.failed 193
monitor.cpuFan.ok 193
monitor.ioexpansion.unpresent 194
monitor.ioexpansionPower.degraded 194
monitor.ioexpansionPower.ok 194
monitor.ioexpansionTemperature.cool 194
monitor.ioexpansionTemperature.ok 195
monitor.ioexpansionTemperature.warm 195
monitor.nvmembattery.warninglow 195
monitor.nvramLowBattery 195
monitor.power.unreadable 196
monitor.shutdown.cancel 196
monitor.shutdown.cancel.nvramLowBattery 196
monitor.shutdown.chassisOverTemp 196
monitor.shutdown.chassisUnderTemp 197
monitor.shutdown.emergency 197
monitor.shutdown.ioexpansionOverTemp 197
monitor.shutdown.nvramLowBattery.pending 197
monitor.temp.unreadable 198
Multiple chassis fans have failed 198
Multiple fan failure on XXXX 198
Multiple power supply fans failed 199
nvmem.battery.capacity.low 199
nvmem.battery.capacity.low.warn 199
nvmem.battery.capacity.normal 200
nvmem.battery.current.high 200
nvmem.battery.current.high.warn 200
nvmem.battery.sensor.unreadable 200
nvmem.battery.temp.high 201
nvmem.battery.temp.low 201
nvmem.battery.temp.normal 201
nvmem.battery.voltage.high 202
nvmem.battery.voltage.high.warn 202
nvmem.battery.voltage.normal 202
nvmem.voltage.high 202
nvmem.voltage.high.warn 203
nvmem.voltage.normal 203
nvram.bat.missing.error 203
nvram.battery.capacity.low 203
nvram.battery.capacity.low.critical 204
nvram.battery.capacity.low.warn 204
nvram.battery.capacity.normal 204
nvram.battery.charging.nocharge 204
nvram.battery.charging.normal 205
nvram.battery.charging.wrongcharge 205
nvram.battery.current.high 205
nvram.battery.current.high.warn 206
nvram.battery.current.low 206
nvram.battery.current.low.warn 206
nvram.battery.current.normal 206
nvram.battery.end_of_life.high 207
nvram.battery.end_of_life.normal 207
nvram.battery.fault 207

nvram.battery.fault.warn 207
nvram.battery.fcc.low 208
nvram.battery.fcc.low.critical 208
nvram.battery.fcc.low.warn 208
nvram.battery.fcc.normal 208
nvram.battery.power.fault 209
nvram.battery.power.normal 209
nvram.battery.sensor.unreadable 209
nvram.battery.temp.high 210
nvram.battery.temp.high.warn 210
nvram.battery.temp.low 210
nvram.battery.temp.normal 211
nvram.battery.voltage.high 211
nvram.battery.voltage.high.warn 211
nvram.battery.voltage.low 211
nvram.battery.voltage.low.warn 212
nvram.battery.voltage.normal 212
nvram.hw.initFail 212
error messages
0200: Failure Fixed Disk 157, 166
0230: System RAM Failed at offset 158
0231: Shadow RAM failed at offset 158
0231: Shadow RAM Failed at offset 166
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162
0253: Clear CMOS jumper detected 162
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
No message on console 165
UTA2 (CNA) maintenance mode and 7-Mode 306
UTA2 (CNA), for clustered systems 308

F
FAS20xx systems
startup progress, viewing 155
FAS2520 systems
10/100/1000Base-T port LEDs 47
1000Base-T port LEDs 47
10GBase-T port LEDs 47
controller attention LED 47
LEDs on the back of the controller 47
management port LEDs 47
NVMEM LED 47
SAS port LEDs 47
FAS2520, FAS2552, and FAS2554 systems
chassis attention LED 43
controller activity LED 43
LEDs on the front of the chassis 43
power LED 43
FAS255x systems
10-GbE port LEDs 50
10/100/1000Base-T port LEDs 50
1000Base-T port LEDs 50
controller attention LED 50
LEDs on the back of the controller 50
management port LEDs 50
NVMEM LED 50
SAS port LEDs 50
UTA2/CNA port LEDs 50
FAS25xx systems
internal drive LEDs 45
introduction to LEDs on 43
location and meaning of internal FRU LEDs 55
PSU LEDs 53
FAS8020 systems
10/100/1000Base-T port LEDs 89
10GbE port LEDs 89
chassis attention LED 86
controller activity LED 86
controller attention LED 89
dual-controller configuration 86
Fibre Channel port LEDs 89
LEDs on the back of the controller 89
location and meaning of fan LEDs 98
location and meaning of internal FRU LEDs 100
management port LEDs 89
NVRAM LED 89
power LED 86
SAS port LEDs 89
UTA2/CNA port LEDs 89
FAS8040, FAS8060, and FAS8080 systems
1000Base-T port LEDs 92
10GbE port LEDs 92
controller attention LED 92
Fibre Channel port LEDs 92
LEDs on the back of the controller 92
location and meaning of fan LEDs 98

location and meaning of internal FRU LEDs 101
management port LEDs 92
NVRAM LED 92
SAS port LEDs 92
UTA2/CNA port LEDs 92
FAS80xx systems
chassis attention LED 88
controller activity LED 88
dual-controller configuration 88
HA interconnect port LEDs 96
HA interconnect ports 96
I/O expansion module fault LED 96
introduction to LEDs on 86
LEDs on the back of the I/O expansion module 96
location and meaning of PSU LEDs 99
power LED 88
FCoE HBA EMS messages
ispcna.mpi.dump 213
ispcna.mpi.dump.saved 213
ispcna.mpi.initFailed 213
FCVI adapter LEDs
introduction to 142
location and meaning of dual-port, 16-Gb 147
location and meaning of dual-port, 2-Gb 142
location and meaning of dual-port, 4-Gb 143
location and meaning of dual-port, 8-Gb 145
feedback
how to send comments about documentation 352
Flash Cache module and PAM EMS messages
callhome.flash.cache.failed 214
extCache.io.BlockChecksumError 214
extCache.io.cardError 214
extCache.io.readError 215
extCache.io.writeError 215
extCache.offline 215
extCache.ReconfigComplete 215
extCache.ReconfigFailed 216
extCache.ReconfigStart 216
extCache.UECCerror 216
extCache.UECCmax 217
fal.chan.offline.comp 217
fal.chan.online.erase.warn 217
fal.chan.online.fail 217
fal.chan.online.read.warn 218
fal.chan.online.rep.fail 218
fal.chan.online.rep.part 218
fal.chan.online.rep.succ 219
fal.chan.online.rep.ver.err 219
fal.chan.online.write.warn 219
fal.init.failed 219
fmm.bad.block.detected 219
fmm.device.stats.missing 220
fmm.domain.card.failure 220
fmm.domain.core.failure 220
fmm.domain.lun.failure 220
fmm.hourly.device.report 221
fmm.log.bb 221
fmm.threshold.bank.degraded 221
fmm.threshold.bank.offline 221
fmm.threshold.card.degraded 222
fmm.threshold.card.failure 222
fmm.threshold.core.offline 222
fmm.threshold.lun.offline 222
iomem.bbm.bbtl.overflow 223
iomem.bbm.new.flash 223
iomem.card.disable 223
iomem.card.enable 224
iomem.card.fail.cecc 224
iomem.card.fail.data.crc 224
iomem.card.fail.desc.crc 224
iomem.card.fail.dimm 223, 225
iomem.card.fail.firmware.primary 225
iomem.card.fail.fpga 225
iomem.card.fail.fpga.primary 226
iomem.card.fail.fpga.rev 226
iomem.card.fail.internal 227
iomem.card.fail.pci 227
iomem.card.fail.uecc 227
iomem.dimm.log.checksum 228
iomem.dimm.log.init 228
iomem.dimm.log.read 228
iomem.dimm.log.sync 228
iomem.dimm.log.write 228
iomem.dimm.mismatch.banks 229
iomem.dimm.mismatch.burst 229
iomem.dimm.mismatch.casLatency 229
iomem.dimm.mismatch.columns 229
iomem.dimm.mismatch.dataWidth 230
iomem.dimm.mismatch.eccWidth 230
iomem.dimm.mismatch.ranks 230
iomem.dimm.mismatch.rows 230
iomem.dimm.mismatch.vendor 231
iomem.dimm.spd.banks 231
iomem.dimm.spd.burst 231
iomem.dimm.spd.casLatency 231
iomem.dimm.spd.checksum 232
iomem.dimm.spd.columns 232
iomem.dimm.spd.dataWidth 232
iomem.dimm.spd.detect 232
iomem.dimm.spd.eccWidth 233

iomem.dimm.spd.ranks 233
iomem.dimm.spd.read 233
iomem.dimm.spd.rows 233
iomem.dma.crc.data 234
iomem.dma.crc.desc 234
iomem.dma.internal 234
iomem.dma.stall 234
iomem.ecc.cecc 235
iomem.ecc.correct.off 235
iomem.ecc.correct.on 235
iomem.ecc.detect.off 235
iomem.ecc.detect.on 236
iomem.ecc.inject 236
iomem.ecc.summary 236
iomem.ecc.uecc 236
iomem.fail.stripe 237
iomem.firmware.package.access 237
iomem.firmware.primary 237
iomem.firmware.program.complete 237
iomem.firmware.program.fail 238
iomem.firmware.program.reboot 238
iomem.firmware.program.start 238
iomem.firmware.rev 238
iomem.flash.mismatch.id 239
iomem.fru.badInfo 239
iomem.fru.checksum 239
iomem.fru.read 239
iomem.fru.write 240
iomem.i2c.link.down 240
iomem.i2c.read.addrNACK 240
iomem.i2c.read.dataNACK 240
iomem.i2c.read.timeout 241
iomem.i2c.write.addrNACK 241
iomem.i2c.write.dataNACK 241
iomem.i2c.write.timeout 241
iomem.init.detect.fpga 241
iomem.init.detect.pci 242
iomem.init.fail 242
iomem.memory.flash.syndrome 242
iomem.memory.none 242
iomem.memory.power.high 243
iomem.memory.power.low 243
iomem.memory.scrub.start 243
iomem.memory.size 243
iomem.memory.zero.complete 244
iomem.memory.zero.start 244
iomem.nor.op.failed 244
iomem.pci.error.config.bar 244
iomem.pio.op.failed 244
iomem.remap.block 245
iomem.remap.target.bad 245
iomem.temp.report 245
iomem.train.complete 245
iomem.train.fail 246
iomem.train.notReady 246
iomem.train.start 246
iomem.vmargin.high 246
iomem.vmargin.low 246
iomem.vmargin.nominal 247
message generation and reporting 214
monitor.extCache.failed 247
monitor.flexscale.noLicense 247
Flash Cache modules
location and meaning of LEDs 128

H
HBA LEDs
dual-port, 3-Gb SAS 139
dual-port, 4-Gb, target-mode Fibre Channel 130
dual-port, 8-Gb, target-mode Fibre Channel 130
location and meaning of dual-port Fibre Channel 129
location and meaning of dual-port, 10-Gb, FCoE
CNA 112
location and meaning of dual-port, 10-Gb, FCoE
unified target 112
location and meaning of dual-port, 16-Gb FC, 10GbE/FCoE UTA2 115
location and meaning of fiber-optic iSCSI target 137
location and meaning of quad-port, 4-Gb, 12-LED
Fibre Channel 133
quad-port, 4-Gb, Fibre Channel, four-LED version 132
quad-port, 8-Gb, Fibre Channel, 12-LED version
135
HBA ports
location of quad-port, 3-Gb SAS 140
high-quality information
how to send feedback about improving
documentation 352

I
information
how to send feedback about improving
documentation 352
internal FRU LEDs
location and meaning of 8020 100
location and meaning of FAS8020 100

L
LEDs
10-GbE port 79
10/100/1000Base-T port LEDs 89
1000Base-T port LEDs 92
10GbE port LEDs 89, 92
20xx system LEDs on the back of the controller
module 32
20xx system LEDs on the front of the chassis 30
22xx internal drive LEDs 39
22xx system internal FRU LEDs 43
22xx system LEDs on the back of the controller 36
22xx system LEDs on the front of the chassis 35
22xx system PSU LEDs 41
2520 system LEDs on the back of the controller 47
2520, 2552, and 2554 system LEDs on the front of
the chassis 43
255x system LEDs on the back of the controller 50
25xx system PSU 53
31xx system fan LEDs 63
31xx system FRU LEDs 65
31xx system LEDs on the front of the chassis 60
31xx system PSU LEDs 63
32xx system fan LEDs 70
32xx system internal FRU LEDs 72
32xx system LED on the back of the I/O expansion
module 69
32xx system LEDs on the back of the controller 66
32xx system PSU LEDs 71
60xx system fan LEDs 75
60xx system LEDs on the back of the controller 74
60xx system LEDs on the front of the controller 73
62xx LEDs on the back of the controller 79
62xx PSU LEDs 84
62xx system fan LEDs 84
62xx system internal FRU LEDs 85
62xx system LEDs on front of chassis 77
62xx system LEDs on the back of the I/O expansion
module 83
8-Gb Fibre Channel port 79
8020 system LEDs on front of chassis 86
8020 system LEDs on the back of the controller 89
8040, 8060, and 8080 systems, on the back of the
controller 92
80xx system LEDs on front of chassis 88
80xx system LEDs on the back of the I/O expansion
module 96
chassis fault LED 65
controller activity LED 65
controller attention LED 89, 92
controller fault 79
controller fault LED 66
controller module 32
controller-controller configuration 65, 77, 86, 88
controller-I/O expansion module configuration 65, 77
dual-port, 3-Gb SAS 139
dual-port, 4-Gb, target-mode Fibre Channel HBA
130
dual-port, 8-Gb FCVI adapter 145
dual-port, 8-Gb, target-mode Fibre Channel HBA
130
Ethernet port 32
FAS2520 system LEDs on the back of the controller
47
FAS2520, FAS2552, and FAS2554 system LEDs on
the front of the chassis 43
FAS255x system LEDs on the back of the controller
50
FAS25xx system PSU 53
FAS8020 system LEDs on front of chassis 86
FAS8020 system LEDs on the back of the controller
89
FAS8040, FAS8060, and FAS8080 systems, on the
back of the controller 92
FAS80xx system LEDs on front of chassis 88
FAS80xx system LEDs on the back of the I/O
expansion module 96
fault, when protocol licensed but not enabled 349
Fibre Channel port 32
Fibre Channel port LEDs 66, 89, 92
FRU LEDs 65
GbE port 79
GbE port LEDs 66
HA interconnect ports on the back of the 80xx I/O
expansion module 96
HBA LEDs
location and meaning of copper iSCSI target
138
internal FRU LEDs 72, 85
introduction to 22xx system 35
introduction to 25xx 43
introduction to 80xx system 86
introduction to additional error conditions 349
introduction to FAS25xx 43
introduction to FAS80xx system 86
introduction to FCVI adapter 142
introduction to MetroCluster adapter 142
introduction to SA300 56

LEDs on the back of the controller 74
LEDs on the back of the controller module 32
LEDs on the front of the chassis 60
LEDs on the front of the chassis 65
LEDs on the front of the controller 73
location and meaning of 2050 single-port 10-GbE
NIC 124
location and meaning of 25xx system internal FRU 55
location and meaning of 31xx system LEDs on the
back of the controller 62
location and meaning of 60xx system PSU 76
location and meaning of 8020 system fan LEDs 98
location and meaning of 8020 system internal FRU
100
location and meaning of 8040, 8060, and 8080
system fan LEDs 98
location and meaning of 8040, 8060, and 8080
system internal FRU 101
location and meaning of 80xx system PSU 99
location and meaning of copper iSCSI target HBA
138
location and meaning of dual-port 10-GbE NIC 125
location and meaning of dual-port Fibre Channel
HBA 129
location and meaning of dual-port GbE NICs 122,
123
location and meaning of dual-port, 10-Gb, FCoE
CNA HBA 112
location and meaning of dual-port, 10-Gb, FCoE
unified target HBA 112
location and meaning of dual-port, 10GBase-CX4
TOE NICs 152
location and meaning of dual-port, 10GBase-SR
TOE NIC 151
location and meaning of dual-port, 16-Gb FC, 10GbE/FCoE UTA2 115
location and meaning of dual-port, 16-Gb
MetroCluster adapter
dual-port, 16-Gb FCVI adapter 147
location and meaning of dual-port, 2-Gb
MetroCluster adapter 142
location and meaning of dual-port, 4-Gb
MetroCluster adapter 143
location and meaning of dual-port, 8-Gb
MetroCluster adapter 145
location and meaning of FAS25xx system internal
FRU 55
location and meaning of FAS8020 system fan LEDs 98
location and meaning of FAS8020 system internal FRU 100
location and meaning of FAS8040, FAS8060, and
FAS8080 system fan LEDs 98
location and meaning of FAS8040, FAS8060, and
FAS8080 system internal FRU 101
location and meaning of FAS80xx system PSU 99
location and meaning of fiber-optic iSCSI target
HBA 137
location and meaning of Flash Cache module 128
location and meaning of multiport GbE NIC 118
location and meaning of PAM 127
location and meaning of PSU 76
location and meaning of quad-port, 4-Gb, 12-LED
Fibre Channel HBA 133
location and meaning of SA300 controller front 56
location and meaning of SA600 system PSU 76
location and meaning of single-port GbE NICs 116
location and meaning of, on the back of the
controller 62
management port LEDs 66, 89, 92
nonvolatile memory (NVMEM) 32
NVMEM LED 66
NVRAM LED 89, 92
NVRAM5 adapter 102
NVRAM5 and NVRAM6 media converter 103
NVRAM6 adapter 102
NVRAM7 adapter 103
NVRAM8 adapter 104
NVRAM9 adapter 109
on 25xx internal drive carriers 45
on FAS25xx internal drive carriers 45
on the back of the 80xx I/O expansion module 96
on the back of the controller 66, 89
on the back of the I/O expansion module 69
on the front of the chassis 30
onboard drive failures, 20xx systems 30
power LED 65
private management port 79
PSU 34, 41, 71
PSU LEDs 63, 84
PSU, 20xx systems 34
PSU, SA200 systems 34
quad-port TOE NICs 149
quad-port, 4-Gb, Fibre Channel HBA, four-LED
version 132
quad-port, 8-Gb, Fibre Channel HBA, 12-LED
version 135
remote management port 32, 79

SA200 system LEDs on the back of the controller
module 32
SA200 system LEDs on the front of the chassis 30
SA300 PSU 59
SA300 system fan 58
SA300 system LEDs on the back of the controller 57
SA320 system fan LEDs 70
SA320 system internal FRU LEDs 72
SA320 system LED on the back of the I/O expansion
module 69
SA320 system LEDs on the back of the controller 66
SA320 system PSU LEDs 71
SA600 system fan LEDs 75
SA600 system LEDs on the back of the controller 74
SA600 system LEDs on the front of the controller 73
SA620 LEDs on the back of the controller 79
SA620 PSU LEDs 84
SA620 system fan LEDs 84
SA620 system internal FRU LEDs 85
SA620 system LEDs on front of chassis 77
SA620 system LEDs on the back of the I/O
expansion module 83
SAS port LEDs 66, 89, 92
single-port TOE NICs 148
UTA2/CNA port LEDs 89, 92
visible from front of system 77
LEDs on the back of the I/O expansion module
I/O expansion module fault 83
private management port LEDs 83

M
maintenance mode error messages
UTA2 (CNA) 306
MetroCluster adapter LEDs
introduction to 142
location and meaning of dual-port, 16-Gb 147
location and meaning of dual-port, 2-Gb 142
location and meaning of dual-port, 4-Gb 143
location and meaning of dual-port, 8-Gb 145

N
NIC LEDs
location and meaning of 2050 single-port 10-GbE 124
location and meaning of dual-port 10-GbE 125
location and meaning of dual-port GbE 122, 123
location and meaning of dual-port, 10GBase-CX4 TOE 152
location and meaning of dual-port, 10GBase-SR TOE 151
location and meaning of multiport GbE 118
location and meaning of single-port GbE 116
No message on console
error message 165
NVRAM5 adapter
LEDs 102
NVRAM6 adapter
LEDs 102
which systems support 101
NVRAM7 adapter
LEDs 103
which systems support 101
NVRAM8 adapter
destage status 104
HA pair 104
LEDs 104
which systems support 101
NVRAM9 adapter
LEDs 109
which systems support 101

O
operational error messages
Disk hung during swap 303
Disk n is broken 304
Dumping core 304
Error dumping core 304
FC-AL LINK_FAILURE 304
FC-AL RECOVERABLE ERRORS 304
information provided in 182
Panicking 305
RMC Alert: Boot Error 305
RMC Alert: Down Appliance 305
RMC Alert: OFW POST Error 305

P
platform troubleshooting
where to find documentation for 28
POST error messages
0200: Failure Fixed Disk 157, 166
0230: System RAM Failed at offset 158, 166
0231: Shadow RAM failed at offset 158
0231: Shadow RAM Failed at offset 166
0232: Extended RAM failed at address line 159
0232: Extended RAM Failed at address line 166
0235: Multiple-bit ECC error occurred 159

023A: ONTAP Detected Bad DIMM in slot 167
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160, 167
0241: Agent Read Timeout 160
0241: SMBus Read Timeout 167
0242: Invalid FRU information 161, 167
0250: System battery is dead 161
0250: System battery is dead - Replace and run
SETUP 168
0251: System CMOS checksum bad 162, 168
0253: Clear CMOS jumper detected 162
0260: System timer error 162, 168
0271: Check date and time settings 168
0280: Previous boot incomplete 162
0280: Previous boot incomplete - Default
configuration used 169
02A3: No Response From SP To FRU ID Read
Request 169
02C2: No valid Boot Loader in System Flash - Non
Fatal 169
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 170
02C3: No valid Boot Loader in System Flash - Fatal 163
02F9: FPGA jumper detected 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FB: Watchdog Timer Reboot (MemTest) 164
02FC: LDTStop Reboot (HTLinkInit) 165
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on console 165
No message on the console 173
Software memory test failed! 173
POST error messages, 22xx systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 25xx systems
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 31xx systems
0200: Failure Fixed Disk 157
0230: System RAM Failed at offset: 158
0231: Shadow RAM failed at offset 158
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
No message on console 165
POST error messages, 32xx and SA320 systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 60xx and SA600 systems
0200: Failure Fixed Disk 157
0230: System RAM Failed at offset: 158
0231: Shadow RAM failed at offset 158
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162
0253: Clear CMOS jumper detected 162
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165

No message on console 165
POST error messages, 62xx and SA620 systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 80xx series systems
0232: Extended RAM Failed at address line 166
POST error messages, 80xx systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on the console 173
Software memory test failed! 173
POST error messages, FAS22xx systems
02A1: SP Not Found 169
BIOS detected unknown errors in DIMM slot: 170
No Response to Controller FRU ID Read Request via IPMI 173
No Response to Midplane FRU ID Read Request via IPMI 173
SP FRU Entry is Blank or Checksum Error 173
POST error messages, SA300 systems
0200: Failure Fixed Disk 157
0230: System RAM Failed at offset: 158
0231: Shadow RAM failed at offset 158
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162

0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
No message on console 165
PSU LEDs
20xx systems 34
22xx systems 41
25xx systems 53
31xx systems 63
32xx systems 71
60xx systems 76
62xx systems 84
FAS25xx systems 53
location and meaning of 80xx systems 99
location and meaning of FAS80xx systems 99
SA200 systems 34
SA300 system 59
SA320 systems 71
SA600 systems 76
SA620 systems 84

Q
quality documentation
how to send feedback about improving 352

R
RLM
AutoSupport e-mail contents 322
types of messages 321
when AutoSupport messages are sent 321
when RLM EMS messages are sent 322
RLM EMS messages
rlm.firmware.update.failed 327
RLM-generated messages
Heartbeat loss warning 322
Reboot (power loss) critical 323
Reboot (watchdog reset) warning 323
Reboot warning 323
RLM heartbeat loss 323
RLM heartbeat stopped 324
System boot failed (POST failed) 324
User triggered (RLM test) 324
User_triggered (system nmi) 324
User_triggered (system power cycle) 325
User_triggered (system power off) 325
User_triggered (system power on) 325
User_triggered (system reset) 325

S
SA200 systems
controller module fault LED 32
controller module LEDs 30
Ethernet port LEDs 32
fault LED 30
Fibre Channel port LEDs 32
LEDs on the back of the controller module 32
LEDs on the front of the chassis 30
NVMEM LED 32
power LED 30
PSU LEDs 34
remote management port LEDs 32
startup progress, viewing 155
SA300 systems
fan LED 58
FC port LEDs 57
GbE port LEDs 57
introduction to LEDs on 56
introduction to POST error messages 157
LEDs on the back of the controller 57
location and meaning of controller front LEDs 56
RLM LEDs 57
SA320 system POST error messages
0231: Shadow RAM Failed at offset 166
SA320 systems
chassis fault LED 65
controller activity LED 65
controller fault LED 66
controller-I/O expansion module configuration 65
dual-controller configuration 65
fan LED 70
Fibre Channel port LEDs 66
GbE port LEDs 66
I/O expansion module fault LED 69
internal FRU LEDs 72
introduction to POST error messages 166
LED on the back of the I/O expansion module 69
LEDs on the back of the controller 66
LEDs on the front of the chassis 65
management port LEDs 66, 69
NVMEM LED 66
power LED 65
PSU LEDs 71

SAS port LEDs 66
SA600 system error messages
02F9: FPGA jumper detected 163
SA600 system POST error messages
02FB: Watchdog Timer Reboot (MemTest) 164
SA600 systems
activity LED 73
fan LEDs 75
Fibre Channel port LEDs 74
GbE port LEDs 74
introduction to POST error messages 157
LEDs on the back of the controller 74
LEDs on the front of the controller 73
location and meaning of PSU LEDs 76
power LED 73
RLM LEDs 74
status LED 73
SA620 system POST error messages
0231: Shadow RAM Failed at offset 166
SA620 systems
10-GbE port LEDs 79
8-Gb Fibre Channel port LEDs 79
chassis fault LED 77
controller activity LED 77
controller fault LED 79
controller-I/O expansion module configuration 77
dual-controller configuration 77
fan LED 84
GbE port LEDs 79
I/O expansion module fault LED 83
internal FRU LEDs 85
introduction to POST error messages 166
LEDs on the back of the controller 79
LEDs on the back of the I/O expansion module 83
power LED 77
private management port LEDs 79, 83
PSU LEDs 84
remote management port LEDs 79
serial port 79
USB port 79
SAS EMS messages
ds.sas.config.warning 247
ds.sas.crc.err 248
ds.sas.drivephy.disableErr 248
ds.sas.element.fault 248
ds.sas.element.xport.error 249
ds.sas.hostphy.disableErr 249
ds.sas.invalid.word 250
ds.sas.loss.dword 250
ds.sas.multPhys.disableErr 250
ds.sas.phyRstProb 251
ds.sas.running.disparity 251
ds.sas.ses.disableErr 251
ds.sas.xfer.element.fault 252
ds.sas.xfer.export.error 252
ds.sas.xfer.not.sent 252
ds.sas.xfer.unknown.error 253
sas.adapter.bad 253
sas.adapter.bootarg.option 253
sas.adapter.debug 254
sas.adapter.exception 254
sas.adapter.failed 254
sas.adapter.firmware.download 254
sas.adapter.firmware.fault 255
sas.adapter.firmware.update.failed 255
sas.adapter.not.ready 255
sas.adapter.offline 256
sas.adapter.offlining 256
sas.adapter.online 256
sas.adapter.online.failed 256
sas.adapter.onlining 257
sas.adapter.reset 257
sas.adapter.unexpected.status 257
sas.cable.error 257
sas.cable.pulled 258
sas.cable.pushed 258
sas.config.mixed.detected 258
sas.device.invalid.wwn 258
sas.device.quiesce 259
sas.device.resetting 259
sas.device.timeout 260
sas.initialization.failed 260
sas.link.error 260
sas.port.disabled 261
sas.port.down 261
sas.shelf.conflict 261
sasmon.adapter.phy.disable 262
sasmon.adapter.phy.event 262
sasmon.disable.module 263
shm.threshold.spareBlocksConsumed 263
shm.threshold.spareBlocksConsumedMax 263
SAS HBA ports
location of quad-port, 3-Gb 140
SAS HBAs
dual-port, 3-Gb SAS HBA ports and cable 139
Service Processor
See SP
SES EMS messages
ses.access.noEnclServ 263
ses.access.noMoreValidPaths 264

ses.access.noShelfSES 265
ses.access.sesUnavailable 265
ses.badShareStorageConfigErr 266
ses.bridge.fw.getFailWarn 266
ses.bridge.fw.mmErr 266
ses.channel.rescanInitiated 267
ses.config.drivePopError 267
ses.config.IllegalEsh270 267
ses.config.shelfMixError 268
ses.config.shelfPopError 268
ses.disk.configOk 268
ses.disk.illegalConfigWarn 268
ses.disk.pctl.timeout 268
ses.download.powerCyclingChannel 269
ses.download.shelfToReboot 269
ses.download.suspendIOForPowerCycle 269
ses.drive.PossShelfAddr 270
ses.drive.shelfAddr.mm 270
ses.exceptionShelfLog 271
ses.extendedShelfLog 271
ses.fw.emptyFile 272
ses.fw.resourceNotAvailable 272
ses.giveback.restartAfter 272
ses.giveback.wait 272
ses.psu.coolingReqError 273
ses.psu.powerReqError 273
ses.remote.configPageError 273
ses.remote.elemDescPageError 274
ses.remote.faultLedError 274
ses.remote.flashLedError 274
ses.remote.shelfListError 274
ses.remote.statPageError 274
ses.shelf.changedID 275
ses.shelf.ctrlFailErr 275
ses.shelf.em.ctrlFailErr 276
ses.shelf.IdBasedAddr 276
ses.shelf.invalNum 276
ses.shelf.mmErr 277
ses.shelf.OSmmErr 277
ses.shelf.powercycle.done 277
ses.shelf.powercycle.start 277
ses.shelf.sameNumReassign 278
ses.shelf.unsupportAllowErr 278
ses.shelf.unsupportedErr 278
ses.startTempOwnership 279
ses.status.ATFCXError 279
ses.status.ATFCXInfo 279
ses.status.currentError 279
ses.status.currentInfo 280
ses.status.currentWarning 280
ses.status.displayError 280
ses.status.displayInfo 281
ses.status.displayWarning 281
ses.status.driveError 281
ses.status.driveOk 282
ses.status.driveWarning 282
ses.status.electronicsError 282
ses.status.electronicsInfo 283
ses.status.electronicsWarn 283
ses.status.ESHPctlStatus 283
ses.status.fanError 283
ses.status.fanInfo 284
ses.status.fanWarning 284
ses.status.ModuleError 284
ses.status.ModuleInfo 284
ses.status.ModuleWarn 285
ses.status.psError 285
ses.status.psInfo 285
ses.status.psWarning 286
ses.status.temperatureError 286
ses.status.temperatureInfo 287
ses.status.temperatureWarning 287
ses.status.upsError 287
ses.status.upsInfo 288
ses.status.volError 288
ses.status.volWarning 288
ses.system.em.mmErr 289
ses.tempOwnershipDone 289
sfu.adapterSuspendIO 289
sfu.auto.update.off.impact 289
sfu.ctrllerElmntsPerShelf 290
sfu.downloadCtrllerBridge 290
sfu.downloadError 290
sfu.downloadingController 290
sfu.downloadingCtrllerR1XX 291
sfu.downloadStarted 291
sfu.downloadSuccess 291
sfu.downloadSummary 291
sfu.downloadSummaryErrors 291
sfu.FCDownloadFailed 292
sfu.firmwareDownrev 292
sfu.firmwareUpToDate 292
sfu.partnerInaccessible 292
sfu.partnerNotResponding 293
sfu.partnerRefusedUpdate 293
sfu.partnerUpdateComplete 293
sfu.partnerUpdateTimeout 294
sfu.rebootRequest 294
sfu.rebootRequestFailure 294
sfu.resumeDiskIO 294

sfu.SASDownloadFailed 295
sfu.statusCheckFailure 295
sfu.suspendDiskIO 295
sfu.suspendSES 295
SP
AutoSupport e-mail contents 312
EMS messages about the SP 314
SP-generated AutoSupport messages 312
when AutoSupport messages are sent 311
when SP EMS messages are sent 312
SP messages
types available for troubleshooting 311
SP-generated messages
HEARTBEAT_LOSS 312
REBOOT (abnormal) 313
SYSTEM_BOOT_FAILED (post failed) 313
USER_TRIGGERED (sp test) 313
USER_TRIGGERED (system nmi) 313
USER_TRIGGERED (system power cycle) 314
USER_TRIGGERED (system power off) 314
USER_TRIGGERED (system reset) 314
startup error messages
boot messages 155
POST messages 154
types of 154
suggestions
how to send feedback about documentation 352

T
TOE NIC LEDs
location and meaning of dual-port, 10GBase-CX4 152
location and meaning of dual-port, 10GBase-SR 151
quad-port 149
single-port 148
tools
boot_diags command 28
forms and use of diagnostic 28
sldiag commands 28
troubleshooting
information sources for 26
types of SP messages available for 311
where to find platform documentation for 28
Troubleshooting
How AutoSupport messages help with troubleshooting 27
Where LEDs appear 26
where messages are displayed 26

U
USB boot devices
introduction to EMS messages 296
USB EMS messages
usb.adapter.debug 296
usb.adapter.exception 296
usb.adapter.failed 296
usb.adapter.reset 297
usb.device.failed 297
usb.device.initialize.failed 297
usb.device.maximum.connected 298
usb.device.protocol.mismatch 298
usb.device.removed 299
usb.device.timeout 299
usb.device.unsupported 299
usb.device.unsupported.speed 300
usb.external.device.not.used 300
usb.externalHub.notSupported 300
usb.port.error 300
usb.port.reset 301
usb.port.state.indeterminate 301
usb.port.status.inconsistent 301
usbmon.boot.device.failed 302
usbmon.boot.device.pfa 302
usbmon.disable.module 302
usbmon.unable.to.monitor 303
UTA2 (CNA)
error messages, introduction to 306
UTA2 (CNA) error messages
maintenance mode and 7-Mode 306
UTA2 (CNA) error messages
for clustered systems 308
