
Monitoring WebSphere DataPower SOA Appliances

Steve Linn (swlinn@us.ibm.com), Consulting I/T Specialist, IBM Software Services for WebSphere, IBM
John Rasmussen (rasmussj@us.ibm.com), Senior I/T Specialist, IBM Software Services for WebSphere, IBM

Summary: This article describes the fundamentals of monitoring the health and capacity of WebSphere DataPower SOA Appliances, including why to monitor and best practices for monitoring.

Introduction
The IBM WebSphere DataPower SOA Appliance (hereafter called DataPower) is a purpose-built hardware platform designed to simplify, secure, and accelerate XML, Web services, and Enterprise Service Bus deployments. As with other network appliances, monitoring the health and capacity of DataPower appliances helps ensure that they are ready to perform the functions for which they are configured. Monitoring not only notifies administrators of exceptions, it also provides trending analysis for managing the appliances and their capacity utilization over time. This enables the organization to maximize its return on investment and to receive warnings of increases in network volumes and potential capacity issues.

This article describes various DataPower status inquiry methods and presents strategies and best practices for interpreting them. This article is based on DataPower Firmware Revision 3.8.0. Monitoring status providers may change with enhancements to the firmware, so you should check current firmware documentation for any additions to monitoring components.

Why monitor?
The DataPower Appliance family consists of 1U rack-mountable network devices. The latest generation devices (9235/9004 class) contain four gigabit RJ-45 Ethernet interfaces, a DB-9 serial port, hot-swappable power supplies and fan trays, batteries, eight gigabytes of RAM, a compact flash-based file system, and other components within a tamper-proof case. Optional features include internal hard drives, hardened cryptographic modules, and additional compact flash bays. Each of these components helps ensure that the device is properly configured for the amount of network data it receives. Knowing that the devices are functioning properly ensures that they are available and ready to process this traffic. For example, if you are alerted to variations in the performance of the device's fans, you may avoid having to take the device offline for unanticipated service. Understanding the level of network traffic and being aware of incremental changes may help you avoid bottlenecks as traffic increases over time.

Monitoring fundamentals on DataPower


DataPower provides a variety of information regarding general system health as well as consumption of resources and services. Physical parameters include CPU temperatures, memory and file system utilization, interface utilization, and voltage readings, among other physical values. In addition, there are more formulaic indicators, such as System Usage, which is a calculation of system capacity.

DataPower exposes these status values in a variety of ways. You can use the Web GUI or Command Line Interface (CLI) show commands to browse a list of status values. Or you can use the XML Management Interface (XMI) to send SOAP messages containing dp:get-status requests to the device, which responds with status information contained in SOAP responses. DataPower also supports the Simple Network Management Protocol (SNMP) and acts as an SNMP agent, providing status information in response to SNMP operations and creating alerts via the SNMP notification mechanism.

Figure 1 shows the CPU usage as displayed within the Web GUI. It is obtained by navigating from Status Menu => System => CPU Usage. The data is displayed in a table covering intervals from the latest 10 seconds through the latest 24 hours.

Figure 1. Web GUI CPU usage display

The CLI show commands are used to display status information, and Listing 1 shows the show cpu command, which provides the same table of data shown in the Web GUI:

Listing 1. CLI Show CPU command

xi50# show cpu
cpu usage (%):
    10 sec    1
    1 min     1
    10 min    7
    1 hour    7
    1 day     7

While the Web GUI and CLI are convenient tools to fetch status information interactively, the XMI can be programmatically integrated into more complex solutions. For example, a Java class could execute a dp:get-status request and perhaps perform configuration modification based on the response. The SOAP request in Listing 2 shows a dp:get-status request to fetch CPU usage status:

Listing 2. Sample get status XMI request

<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <dp:request xmlns:dp="http://www.datapower.com/schemas/management">
      <dp:get-status class="CPUUsage"/>
    </dp:request>
  </env:Body>
</env:Envelope>

The response is returned in a SOAP payload, as shown in Listing 3 below. Again, the CPU status is returned within a subtree containing the same table of data returned by the Web GUI and CLI:

Listing 3. XMI dp:get-status response

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <dp:response xmlns:dp="http://www.datapower.com/schemas/management">
      <dp:timestamp>2009-09-24T11:56:22-04:00</dp:timestamp>
      <dp:status>
        <CPUUsage xmlns:env="http://www.w3.org/2003/05/soap-envelope">
          <tenSeconds>1</tenSeconds>
          <oneMinute>1</oneMinute>
          <tenMinutes>1</tenMinutes>
          <oneHour>1</oneHour>
          <oneDay>1</oneDay>
        </CPUUsage>
      </dp:status>
    </dp:response>
  </env:Body>
</env:Envelope>
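As a rough illustration of the programmatic integration described above, the following Python sketch builds a dp:get-status request like the one in Listing 2 and extracts the values from a response like the one in Listing 3. It uses only the standard library. The function names build_get_status_request and parse_status are hypothetical, and actually sending the request to the appliance (transport, URL, credentials) is deliberately left out because it depends on how the XMI is configured.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from Listings 2 and 3.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
DP_NS = "http://www.datapower.com/schemas/management"


def build_get_status_request(status_class):
    """Build a dp:get-status SOAP envelope (hypothetical helper name)."""
    ET.register_namespace("env", SOAP_NS)
    ET.register_namespace("dp", DP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    request = ET.SubElement(body, "{%s}request" % DP_NS)
    ET.SubElement(request, "{%s}get-status" % DP_NS, {"class": status_class})
    return ET.tostring(envelope, encoding="utf-8")


def parse_status(response_xml, status_class):
    """Return one dict per status element found under dp:status.

    response_xml is the raw response as bytes; values are kept as strings.
    """
    root = ET.fromstring(response_xml)
    rows = []
    for status in root.iter("{%s}status" % DP_NS):
        # Status elements such as CPUUsage are not namespace-qualified.
        for entry in status.findall(status_class):
            rows.append({child.tag: (child.text or "").strip() for child in entry})
    return rows
```

Calling parse_status on the Listing 3 response with the class CPUUsage would yield a single row keyed by tenSeconds, oneMinute, tenMinutes, oneHour, and oneDay. Tabular status classes would yield one row per table entry.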

You can get a vast amount of status data using the dp:get-status request. For more information, including details of the schemas and WSDL used to customize dp:get-status and other XMI operations, see the IBM Redpaper WebSphere DataPower SOA Appliances: The XML Management Interface.

Most organizations query the health and capacity of a network device using the SNMP protocol in conjunction with tools such as those in the IBM Tivoli Monitoring (ITM) and Tivoli Composite Application Manager (ITCAM) product families. These tools use SNMP over UDP to poll an SNMP agent for device and application metrics. The management software may also receive notification alerts from the agent in response to particular events happening on the device. The DataPower appliance may be configured to act as an SNMP agent, responding to inbound polling requests and sending alerts in response to preconfigured events.

SNMP status variables are organized in hierarchies, which are described by the Management Information Base (MIB) document. Each metric that can be polled is addressed by an Object Identifier (OID). Some metrics are scalar objects describing a single data point, such as the current firmware version on the appliance. Other metrics may be tabular, such as the CPU status provided in the previous examples. When a specific OID is known, a GET OID can be used by the SNMP manager to get the specific metric. If all metrics in a specific hierarchy are desired, a Get Subtree can be used to get all values within that hierarchy.

The DataPower appliance provides three Enterprise MIB documents for configuration, status, and notification. It is the status MIB that we are interested in. While status inquiry is straightforward, alerting involves several DataPower objects. The device has four built-in notification alerts: authenticationFailure, linkDown, coldStart, and linkUp. Others are preconfigured, as described below. A properly configured SNMP monitor receives these traps when the device restarts, when its interfaces become enabled or disabled, or when an attempt to access the device fails. In addition to the built-in alerts, custom alerts may be generated by subscribing to a list of error conditions or in conjunction with the logging system.

Reliance on alerts alone is not a sufficient monitoring strategy. For example, if the event causing the alert affects the device's ability to send the message over the network, the notification may not be received at the SNMP monitor. Therefore it is prudent to combine subscription to alert messages with polling of status information, to provide a robust mechanism for communicating with monitoring tools.

How to monitor
Many status providers (or monitoring agents) are built into the DataPower firmware to fetch status data. Many providers are specific to the device. These providers (such as the environmental components, fans, temperatures, or battery health) are available within the default domain and are always enabled. Other status data (such as transaction rates for DataPower services) are segmented by application domain and may be further segmented by XML-Manager or DataPower service.

While the device-level data is automatically enabled, transaction data such as transaction rates or transaction times is usually available only when Statistics are enabled on the device. There are exceptions to this generalization. For example, CPU status requires statistic enablement, while System Load does not. Each domain must have its individual Statistics setting enabled to provide domain-specific status.

This section shows you how to enable monitoring of DataPower from SNMP tools, and how to produce SNMP alerts from within DataPower. You'll see how a logging target can be configured to produce alerts based on system events, and how to subscribe to events such as Out of Memory or Power Supply failure to generate alerts. An example of a Power Supply failure will be used to demonstrate these principles.

SNMP settings must first be configured on the DataPower appliance. This configuration is available in the default domain. To reach it, select the Administration menu in the left navigation menu of the DataPower Web GUI, and then select SNMP Settings under the Access heading. This configuration consists of multiple tabs. The main tab must have the Admin State set to Enabled. Typically, the Local IP Address is set to a Host Alias defined in the default domain that maps to the Management Interface IP, which restricts SNMP polling requests to this IP and not any of the client traffic interfaces (eth0, eth1, or eth2). Figure 2 shows SNMP settings enabled on the default Local Port of 161. Outbound polling responses and traps will be sent out using any appliance interface that has the correct routing. To restrict this outbound traffic to the same IP, add a static route to the appliance's mgt0 configuration.

Figure 2. Enabling SNMP settings

The DataPower MIBs can be downloaded from the appliance to be used by any SNMP management tool. The MIBs enable these tools to translate named objects such as dpStatusMemoryStatusUsage to an OID used to request the metric. All appliance status OIDs are in the drStatusMIB.txt MIB file. Figure 3 shows the Enterprise MIBs tab of the SNMP Settings screen, and the method for downloading the MIBs:

Figure 3. SNMP MIB download tab

The Trap Event Subscription tab contains a list of event codes that can be sent to the management software as an alert. Examples are the codes for "Internal cooling fan has stopped" or "Power supply failure." Figure 4 below shows some of the default preloaded subscriptions. To add additional events, click Select Code. If a specific code is not shown in the list, you can add it manually. For example, adding code 0x806000e2 adds certificate monitor events to indicate when a certificate is nearing expiration. You can get these event codes from their associated log records in the default log. You can also get the event code in the Message Reference document for your firmware release.

Figure 4. SNMP Trap Event Subscription

The SNMPV1/V2c Communities tab defines access policies for management software using SNMP V1 and V2. The community name is used as a credential to access the SNMP data on the appliance. A common community name for read-only access is public. A DataPower domain, either the default or an application domain, can be associated with the configured community. If application data is to be polled, specify the application domain; otherwise use the default domain. Specifying an application domain does not prevent management software from polling device-level metrics such as device load, CPU utilization, memory metrics, and environmental statistics. Additionally, it allows polling of application metrics such as transaction rates and times, MQ queue manager status, message counters, or SLM metrics. The mode of the community should be configured as read-only for access to appliance status metrics. Finally, a remote host access of 0.0.0.0/0 lets any SNMP manager access this community. It can be restricted to a range of IPs if desired. To configure additional communities, click Add.
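To make the effect of the remote host access setting concrete, here is a small Python sketch of how an address-range restriction of this kind behaves. It only models the CIDR matching with the standard ipaddress module; the function name manager_allowed and the sample addresses are hypothetical, and the appliance's own matching logic is not shown in this article.

```python
import ipaddress


def manager_allowed(manager_ip, remote_host_access):
    """Return True if manager_ip falls inside the configured address range.

    remote_host_access is a CIDR string such as "0.0.0.0/0" (any IPv4 host).
    This models the idea only; it is not DataPower code.
    """
    network = ipaddress.ip_network(remote_host_access, strict=False)
    return ipaddress.ip_address(manager_ip) in network
```

With 0.0.0.0/0 every IPv4 manager matches, while a range such as 192.168.0.0/24 (an example value) admits only managers on that subnet.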

Figure 5 shows the specification of an SNMP V1/V2c community name of public for read-only access to application domain status within the swlinn-poc domain.

Figure 5. SNMP Community Settings specifying an application domain

The Trap and Notification Targets tab lets you specify the IP and port of the SNMP manager that will receive SNMP alerts and notifications. The default is UDP port 162. The community name and the SNMP version (1, 2c, or 3) must be specified. If Version 3 is used, a DataPower user name is provided in the Security Name field. This user will be configured with SNMP V3 credentials. The specific events that are alerted are configured on either the SNMP Trap Event Subscription tab or on the subscription configuration of an SNMP logging target. Events preconfigured by default on the SNMP Trap Event Subscription tab are critical device-specific events, such as memory exhaustion, or hardware issues with the power supplies, battery, or fans. To configure additional notification targets, click Add. Figure 6 shows the configuration of the recipient of SNMP alerts using SNMP Version 2c with the community name of public:

Figure 6. SNMP Trap and Notification Targets

Finally, the SNMPV3 Contexts tab gives SNMPV3 managers access to non-default application domains.

To allow only SNMP polling, enabling the SNMP settings and providing an SNMPV1/V2c community is all that is required. Trap and notification targets and event subscriptions are required for sending event alerts to an SNMP manager.

As previously mentioned, some status data such as fan speeds and CPU utilization is specific to the device. Other status data such as transaction rates are segmented by application domain and are accumulated only if the statistics setting is enabled, as shown in Figure 7 below. Enabling statistics has a very small impact on system utilization. Adjusting the Load Interval (the frequency of SNMP polling) will further limit this impact.

Figure 7. Statistics enabled per domain

Here is an example of a poll of an appliance metric: An SNMP manager issues an SNMP GET command for the dpStatusMemoryStatusUsage metric, which returns a scalar value of the percentage of memory being utilized. Many SNMP managers, when configured with the DataPower MIBs, provide a tree hierarchy of the status MIB from which the appropriate metric can be selected, the metric polled, and the value displayed.

Application metrics can also be polled if the application domain is specified in the DataPower SNMP configuration. Depending on the application configuration, specific metrics can be polled to provide data on the health or throughput of the application. These application-related table entries differ from system-level metrics in that they are dynamic and are based on the key fields of these tables. For an example of a poll of an application metric, consider the dpStatusHTTPTransactions2Table table, which contains the transaction rates for all services in a domain over various time intervals. Metrics in this table are based upon the service class, such as XMLFirewallService, and the service name, such as Loopback_FW.

In addition to the event subscriptions that you can specify in the SNMP settings, you can also configure a DataPower logging target to produce SNMP logging events, which enables DataPower to send SNMP alerts for specific events of interest. Select Manage Logging Targets from the left navigation of the DataPower Web GUI, in the Administration menu under the Miscellaneous heading. Click Add to create a new logging target, and specify Target Type to be SNMP. Figure 8 shows a log target with an SNMP Target Type:

Figure 8. SNMP logging target

The SNMP logging target can subscribe to and filter events just like any other DataPower logging target. The SNMP configuration's list of trap and notification event codes specifies most critical events. An SNMP logging target in the default domain that subscribes to all events with a severity of critical or above is a similar way to produce these alerts. However, the logging target subscriptions in an application domain are more application specific. For example, you can specify logs with an MQ or SSL log category at the error or above level. You can also specify log messages generated by custom stylesheets using custom log categories. Figure 9 shows the subscription of all critical events for this SNMP type log target:

Figure 9. Logging Target Subscriptions

Now that the steps to configure and enable SNMP alerts have been described, here is a demonstration of a power supply alert. With the above configuration, the plug from one of two power supplies is pulled. Figure 10 shows log entries associated with a power supply failure:

Figure 10. System Log Entries

The SNMP configuration specified no restrictions on the SNMP Managers that could receive alerts from this appliance's public community. Any SNMP manager listening for alerts from this appliance on Port 162 will receive a trap for the power failure event.

This section has shown you how to configure DataPower to enable monitoring of both appliance and application metrics from SNMP tools, and how to produce SNMP alerts from within an appliance. A logging target was configured to produce alerts based on logging events. The SNMP settings were configured to produce alerts by subscribing to system events (such as "Out of memory" or "Power supply has failed") as well as an application event (an SSL certificate expiration warning). Enabling statistics for application-level metrics was also shown. A poll of the memory metrics was shown to demonstrate monitoring of device metrics, and a poll of the transaction rate table was shown to demonstrate monitoring of application-specific metrics. Finally, an example of a power supply failure was used to demonstrate SNMP alerting.

What to monitor
Monitoring accomplishes multiple goals. The general health of the device and of its various physical components can be ascertained from environmental status information such as temperatures, fan speeds, and the status of batteries and power supplies. System load can be gauged by a special status value known as System Usage, in addition to more familiar measurements such as CPU, memory, and file system utilization. The amount of data being processed by the device can be determined by analyzing network interface consumption. The following sections discuss several informative status values. Each section shows how to find the data in the Web GUI, the CLI command that shows the status, the element in the XMI response, and the object in the SNMP Enterprise MIB that contains the value.

General device health and activity monitors


General health and activity monitors ensure that the DataPower device is operating within predefined system parameters. You can analyze system capacity via system load and CPU utilization. You can evaluate uptime to ensure that the device has not experienced an unexpected restart. Fans and temperatures are checked to avoid overheating, which can take a device out of service. The following monitors are involved in these tasks:

System usage
Web GUI: System => System Usage
CLI: Show Load
XMI: SystemUsage/Load
Status MIB: dpStatusSystemUsageLoad

System Usage is a measurement of the device's ability to accept additional work. It is a formulaic calculation based on various components of system load. System Usage is typically considered the best single indicator of overall system capacity. While it may sometimes spike to 100%, typical values are less than 75%. The secondary work list value is a calculation of queued tasks, and is of lesser interest in typical monitoring situations.

Figure 11. System Usage Status

CPU Usage
Web GUI: System => CPU Usage
CLI: Show cpu
XMI: CPUUsage
Status MIB: dpStatusCPUUsage

CPU Usage statistics are provided over five time intervals. Many customers are accustomed to monitoring CPU utilization, but this metric in DataPower is not as reliable as System Usage in determining device capacity. DataPower is self-optimizing, and spikes in CPU unassociated with traffic levels may occur as the device performs background activities. CPU usage may sometimes spike all the way up to 100%, but this level is not necessarily a concern unless it is sustained over numerous consecutive polls.

Figure 12. CPU Usage Status
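The idea of reacting only to sustained CPU levels, rather than to single spikes, can be sketched on the monitoring side as follows. This is a minimal Python illustration, not part of DataPower or of any SNMP tool; the class name SustainedThreshold, the 90% limit, and the choice of three consecutive polls are assumptions made for the example.

```python
from collections import deque


class SustainedThreshold:
    """Flag a metric only when it exceeds a limit on N consecutive polls."""

    def __init__(self, limit, consecutive):
        self.limit = limit
        # Remember only the outcome of the most recent N polls.
        self.recent = deque(maxlen=consecutive)

    def observe(self, value):
        """Record one polled value; return True if the breach is sustained."""
        self.recent.append(value > self.limit)
        return len(self.recent) == self.recent.maxlen and all(self.recent)


# Example: alert only after three consecutive polls above 90%.
monitor = SustainedThreshold(limit=90, consecutive=3)
flags = [monitor.observe(v) for v in [100, 95, 50, 99, 100, 97]]
```

In this run only the final poll raises the flag, because the dip to 50% interrupts the first run of high readings.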

Memory usage
Web GUI: System => System => Memory Usage
CLI: Show memory
XMI: MemoryStatus
Status MIB: dpStatusMemoryStatus

Memory Usage statistics are provided for various classifications of the appliance's flash memory. Statistics include a percentage of total memory utilized; bytes of total, used, and free memory; and, of lesser interest in typical monitoring, request, XG4, and held memory. The percentage of used memory depends on the application, the size of request and response messages, and the volume and latency of requests. Typical utilization runs less than 80%, and statistics beyond this threshold are of concern. You can use the device's Throttle Settings to temporarily slow down request processing or to perform a warm restart, which recaptures memory in this situation. The following system error codes are associated with these sensors and can be used to trigger alerts from the SNMP Trap Event Subscription configuration:

0x01a40001 Throttling connections due to low memory
0x01a30002 Restart due to low memory
0x01a50004 Memory usage recovered above threshold

Figure 13. Memory Usage Status

File system information


Web GUI: System => System => File system Information
CLI: Show Filesystem
XMI: FilesystemStatus
Status MIB: dpStatusFilesystemStatus

File system statistics are provided for free and total space of the encrypted, temporary, and internal file systems. Monitor all free space metrics; levels below 20% of the total space are a concern. You can use the device's Throttle Settings to temporarily slow down request processing or to perform a warm restart, which recaptures file system space in situations of reduced free space. The following system error codes are associated with these sensors and can be used to trigger alerts from the SNMP Trap Event Subscription configuration:

0x01a40005 Throttling connections due to low temporary file space
0x01a30006 Restart due to low temporary file space
0x01a50007 Temporary file space recovered above threshold

Figure 14. File system Usage Status

System up time
Web GUI: Main => Date and Time
CLI: DateTimeStatus/uptime
XMI: DateTimeStatus/uptime
Status MIB: dpStatusDateTimeStatusuptime

System up time indicates the elapsed time since the device was last restarted, including controlled firmware reloads as well as any unexpected device restarts. The DataPower device restarts itself automatically in conjunction with throttle configurations such as memory or file system constraints. While you can use SNMP notification for alerting, monitoring uptime via polling ensures that any notification delivery failure will not obscure these events.
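Detecting a restart from polled uptime amounts to noticing that the value has gone backwards between two polls. The following Python sketch shows that comparison. It assumes the monitoring tool has already converted the uptime value into a number of seconds; the function name restarted_between_polls is hypothetical.

```python
def restarted_between_polls(previous_uptime_s, current_uptime_s):
    """Return True when uptime decreased since the last poll.

    Both arguments are uptimes in seconds (an assumed unit). Pass None as
    previous_uptime_s on the very first poll, when nothing can be compared.
    """
    if previous_uptime_s is None:
        return False
    return current_uptime_s < previous_uptime_s
```

A poll that returns 120 seconds after an earlier poll returned 86400 seconds indicates that the device restarted in between, even if no trap was received.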

Figure 15. Date and time status

Temperature sensors
Web GUI: System => Temperature Sensors
CLI: Show Sensors-Temperature
XMI: TemperatureSensors/{various name values}
Status MIB: dpStatusTemperatureSensorsTable

Various temperature readings are available for CPUs, Memory, and System. Each has a warning and danger temperature associated with it and a status value of OK or FAIL. Monitoring the status ensures that the device is operating within the specified range. Investigate temperatures outside the ranges by checking fan speeds and airflow around the device, and if necessary by contacting DataPower Support.

Figure 16. Temperature sensors status

Fan sensors
Web GUI: System => Fan Sensors
CLI: Show Sensors-Fan
XMI: EnvironmentalFanSensors/{various fan-id values}
Status MIB: dpStatusEnvironmentalFanSensorsTable

Proper functioning of the device's fans is vital for proper operation. There are two hot-swappable fan trays. If the device contains the optional hard disk drives, it will have two additional fans. Each value is associated with a minimum range and a status indicator. Monitoring the status value will ensure proper functioning of the fans. The following system error codes are associated with these sensors and can be used to trigger alerts from the SNMP Trap Event Subscription configuration:

0x02240002 Internal cooling fan has slowed
0x02220003 Internal cooling fan has stopped

Figure 17. Fan sensors status

Other sensors
Web GUI: System => Other Sensors
CLI: Show Sensors-Other
XMI: EthernetInterfaceStatus/{various name values}
Status MIB: dpStatusOtherSensorsTable

There are several additional sensors grouped into the Other classification, including battery, hard disk, and power supply indicators. The intrusion detection sensor is also in this list, and it is triggered when tampering of the physical device is detected. All of these variables include a status value. Monitoring the status value will ensure proper functioning of the fans and other components. The following system error codes are associated with these sensors and can be used to trigger alerts from the SNMP Trap Event Subscription configuration:
0x02220001 Power supply failure
0x02220004 System battery missing
0x02220005 System battery failed

Replace the battery every two years -- critical level log records will begin to appear before that.

Figure 18. Other sensors status

Interface utilization statistics


Interface utilization monitors provide an analysis of the amount of data that is being received and transmitted by the DataPower device. Each device contains four gigabit interfaces. Monitoring this utilization can help you understand your transmission rates and how they change over time. Knowing that traffic for a service is increasing by 10% per month, for example, helps you anticipate the need for additional resources such as DataPower or backend devices.
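A growth rate like the 10% per month in the example above translates directly into a planning horizon. Assuming traffic compounds at a steady monthly rate g, a current level T reaches a capacity C after roughly ln(C/T) / ln(1 + g) months. The Python sketch below performs that calculation; the function name months_until_capacity and the sample figures are hypothetical, and real traffic rarely grows this smoothly.

```python
import math


def months_until_capacity(current, capacity, monthly_growth):
    """Whole months until steadily compounding traffic reaches capacity.

    current and capacity share any one unit (for example kbps);
    monthly_growth is a fraction such as 0.10 for 10% per month.
    """
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    if current >= capacity:
        return 0
    return math.ceil(math.log(capacity / current) / math.log(1 + monthly_growth))
```

For instance, traffic at 400 units against a limit of 1000 units, growing 10% per month, would reach the limit in about 10 months under these assumptions.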

Ethernet interfaces
Web GUI: System => Ethernet Interfaces
CLI: Show Ethernet
XMI: EthernetInterfaceStatus/{various name values}
Status MIB: dpStatusEthernetInterfaceStatusTable

Figure 19. Ethernet interface status

Receive and transmit throughput


Web GUI: IP-Network => RX Throughput
CLI: Show receive-kbps
XMI: ReceiveKbpsThroughput/{various time values}
Status MIB: dpStatusReceiveKbpsThroughputTable

Web GUI: IP-Network => TX Throughput
CLI: Show transmit-kbps
XMI: TransmitKbpsThroughput/{various time values}
Status MIB: dpStatusTransmitKbpsThroughputTable

Receive and transmit throughput information can help you understand the amount of data being processed by the device. These statistics are provided for five time values ranging from 10 seconds up to the most recent 24-hour period. This data point is an important one to capture in order to understand the network load that is being applied to the device. It includes management traffic. If you have not segregated management traffic such as Web GUI, CLI, and XMI to a separate interface, then this data will be included with any application traffic.

Each DataPower configuration (or application if you prefer) will vary significantly in terms of the processing done on individual messages. In some instances, small messages may trigger significant processing, perhaps requesting additional data from off-box endpoints, performing processor-intensive cryptographic operations, or in some other way generating significant system load. In other instances, large messages may be simply routed and require less processing. While there is no hard and fast rule, over time, observations of increases in data will correspond to increases in utilization of DataPower resources. Knowing this information before bottlenecks occur and alleviating the load with additional DataPower devices can help you avoid system interruptions.

Figure 20. Rx throughput status

HTTP Connections
Web GUI: Connection => HTTP Connection Statistics
CLI: Show http connection
XMI: EthernetInterfaceStatus/{various name values}
Status MIB: HTTPConnections

HTTP connections are produced at the domain level. Statistics must be enabled for each domain that is to produce HTTP connection data. One peculiarity is that HTTP connection data is not accumulated for services in loopback mode. The status data is segmented by XML-Manager and contains information about HTTP connections, such as request and reuse. This data can help you understand the level of connections and can be used to judge utilization growth over time.

Figure 21. HTTP connections status

Transaction rate and time


Transaction rates and elapsed times for individual services are accumulated per domain and, within each domain, per service. Transaction rate and time are not provided unless statistics are enabled for each domain. This data can help you understand the number of transactions processed and the average response time of those transactions for a particular service over a number of time intervals.

Web GUI: Connection => Transaction Rate
CLI: Show http
XMI: HTTPTransactions/{various time values}
Status MIB: dpStatusHTTPTransactionsTable

Web GUI: Connection => Transaction Time
CLI: Show http
XMI: HTTPMeanTransactionTime/{various time values}
Status MIB: dpStatusHTTPMeanTransactionTimeTable

Figure 22. Transaction rate status

Other network status providers


DataPower supports many protocols beyond the HTTP examples discussed so far, including support for FTP, IMS, MQ, NFS, NTP, SQL, Tibco, and WebSphere JMS. Each of these protocols is represented by status providers, and as in the case of the previous examples, each is supported by the Web GUI, CLI, XMI, and SNMP. Individual configurations may not use any of these additional protocols, and few will use all of them. However, in a configuration that is using one or more of these protocols, monitoring the related status provider is prudent.

Best practices
Successful monitoring of the DataPower appliance relies on active and proactive inquiry of status information. SNMP tools must be configured both to listen for traps sent by the device and to poll the device periodically for MIB status data. These actions require a combination of DataPower SNMP Trap Event Subscription configuration and configuration of the SNMP monitoring tool to poll and, potentially, to raise alerts based on returned status values.

In addition to device monitoring, application monitoring is also a useful practice. In this instance sample messages may be sent from robotic clients through the DataPower service to ensure that all network links (including load balancers) are operational. In some instances, this effort is extended to include sending messages through to backend service provider applications to ensure that both frontside and backside links are in service. Both DataPower and backside resources must be configured to respond appropriately to these test messages.

The DataPower SNMP trap subscription capability is a useful method of leveraging SNMP notification of events within DataPower. Here is a suggested list of error codes to subscribe to. In the event that the error is produced, the SNMP agent on DataPower will send an Alert/Trap to the SNMP monitor.

Suggested error code subscription
Code        Category       Severity  Description
0x02220001  environmental  critical  Power supply failure
0x02240002  environmental  warning   Internal cooling fan has slowed
0x02220003  environmental  critical  Internal cooling fan has stopped
0x02220004  environmental  critical  System battery missing
0x02220005  environmental  critical  System battery failed
0x00330002  mgmt           error     Memory full
0x01a40001  system         warning   Throttling connections due to low memory
0x01a30002  system         error     Restart due to low memory
0x01a30003  system         error     Restart due to resource shortage timeout
0x01a50004  system         notice    Memory usage recovered above threshold
0x01a40005  system         warning   Throttling connections due to low temporary file space
0x01a30006  system         error     Restart due to low temporary file space
0x01a50007  system         notice    Temporary file space recovered above threshold
0x01a40008  system         warning   Throttling connections due to low number of free ports
0x01a30009  system         error     Restart due to port shortage
0x01a3000b  system         error     Restart due to prefix qcode shortage
0x01a3000c  system         error     Restart due to namespace qcode shortage
0x01a3000d  system         error     Restart due to local qcode shortage
0x01a2000e  system         critical  Installed battery is nearing end of life
0x01a30011  system         error     Invalid virtual file system
0x01a30012  system         error     File not found
0x01a30013  system         error     Buffer too small
0x01a30014  system         error     I/O error
0x01a30015  system         error     Out of memory
0x01a10016  system         alert     Number of free qcodes is very low
0x01a30017  system         error     Restart due to low file descriptor
0x01a40018  system         warning   Throttling due to low number of available file descriptors

MIB status values to monitor

It is recommended that SNMP monitors be configured to fetch and report on the following conditions:

dpStatusSystemUsageLoad                   >80% for interval of 10 minutes or more
dpStatusCPUUsagetenMinutes                >90% (10 minute interval)
dpStatusFilesystemStatusFreeTemporary     <20%, may be unnecessary due to error code subscription
dpStatusFilesystemStatusFreeUnencrypted   <20%, may be unnecessary due to error code subscription
dpStatusFilesystemStatusFreeEncrypted     <20%, may be unnecessary due to error code subscription
dpStatusMemoryStatusFreeMemory            <20%, may be unnecessary due to error code subscription
dpStatusTemperatureSensorsReadingStatus   Various temperature sensor readings (table)
dpStatusEthernetInterfaceStatusStatus     For configured interfaces
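One way a monitoring script might apply the numeric conditions in this table is sketched below in Python. The status value names and limits come from the table; everything else is an assumption made for illustration. In particular, the sketch assumes that each reading has already been fetched and converted to a percentage (free-space and free-memory values may need to be divided by the corresponding totals first), and it checks a single sample rather than a 10-minute interval.

```python
# (direction, limit in percent) for each status value, taken from the table.
THRESHOLDS = {
    "dpStatusSystemUsageLoad": ("above", 80),
    "dpStatusCPUUsagetenMinutes": ("above", 90),
    "dpStatusFilesystemStatusFreeTemporary": ("below", 20),
    "dpStatusFilesystemStatusFreeUnencrypted": ("below", 20),
    "dpStatusFilesystemStatusFreeEncrypted": ("below", 20),
    "dpStatusMemoryStatusFreeMemory": ("below", 20),
}


def breaches(readings):
    """Return the names of readings that violate their threshold.

    readings maps a status value name to a percentage (assumed unit).
    Names without a configured threshold are ignored.
    """
    found = []
    for name, value in readings.items():
        rule = THRESHOLDS.get(name)
        if rule is None:
            continue
        direction, limit = rule
        if direction == "above" and value > limit:
            found.append(name)
        elif direction == "below" and value < limit:
            found.append(name)
    return found
```

A poll that reports a system usage of 85% and 15% free memory would return both names, which the monitoring tool could then turn into an alert.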

MIB status values to monitor for interface utilization

In addition to polling for status data, it is important to ascertain the normal traffic patterns of applications over time. The best way to do this is to capture and monitor the amount of network traffic that the device is processing. The transmit and receive values below will help you predict when devices will become saturated with traffic. Knowing this ahead of time can help you avoid service disruptions.

dpStatusNetworkTransmitDataThroughputTenMinutesBits   Capture values over extended time
dpStatusNetworkReceiveDataThroughputTenMinutesBits    Capture values over extended time

Conclusion
