Configuring Performance Monitoring and Thresholds in Brocade Data Center Fabric Manager (DCFM)
Brocade DCFM supports two types of performance monitoring: historical and real-time. This paper offers best practices for performance monitoring and setting thresholds.
DATA CENTER
CONFIGURATION GUIDE
CONTENTS
Introduction
Historical Performance Data Collection
  Enabling Historical Data Collection
  Real-time Performance Data Collection
Performance Reports
Using Historical Performance Data
  End-to-End Monitors
  Top Talkers
Threshold Events Alerts
  About Fabric Watch and Threshold Alerts
  Fabric Class Areas
  E_Port Class Areas
  F_Port Class Areas
  Setting a Fabric-Wide Threshold Policy
  Setting Both a Fabric-Wide and a Device Threshold Policy
  Creating a New Threshold Policy Quickly
INTRODUCTION
Brocade Data Center Fabric Manager (DCFM) provides real-time and historical performance monitoring to enable proactive problem diagnosis, maximize resource utilization, and facilitate capacity planning. There are two types of performance monitoring in DCFM:
- Historical data collection
- Real-time data collection
Another application, Brocade Fabric Watch, can be configured to monitor and report very specific user-specified conditions. Once configured, Fabric Watch posts alerts to DCFM and/or SNMP trap receivers for further processing. Threshold event alerts can be easily configured in Brocade DCFM. This paper provides best practices for setting threshold limits. Threshold limits monitor the health of a component or behavior by sampling the status on a switch running Fabric Watch and comparing the sample data to the set thresholds.
TIP: Before starting performance monitoring, make sure that your SNMP setup is accurate by checking the Performance Management Requirements section of the Brocade DCFM Enterprise Users Manual.
Once you make your selection, in this case, Enable Selected Fabrics, the Historical Data Collection dialog box is displayed.
2. In this example, Geneva VF2 is selected. You can change this at any time by going back to Monitor > Performance > Historical Data Collection.
3. Once historical data collection is enabled, choose Monitor > Performance > View Utilization to show the traffic. The topology view now displays some of the traffic between the Brocade DCX Backbone and the storage in red (traffic that exceeds 80% utilization), which may indicate a problem.
4. To customize the display of connection utilization, click change in the legend panel.
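The 80% rule used for the red links above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not DCFM code: the function names, the sample throughput, and the link speed are assumptions made for the example.

```python
def utilization_percent(throughput_mbps: float, link_speed_mbps: float) -> float:
    """Return link utilization as a percentage of the link's capacity."""
    if link_speed_mbps <= 0:
        raise ValueError("link speed must be positive")
    return 100.0 * throughput_mbps / link_speed_mbps


def exceeds_red_band(utilization: float, red_threshold: float = 80.0) -> bool:
    """True when utilization is above the red threshold (80% in the text)."""
    return utilization > red_threshold


# Hypothetical sample: a link carrying 6800 Mb/s on an 8000 Mb/s connection.
sample = utilization_percent(6800, 8000)
print(f"{sample:.1f}% utilization, red: {exceeds_red_band(sample)}")
```

If you customize the legend, the same check applies with a different value passed as red_threshold.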
2.
3. Confirm, and the following real-time graph for these ports is displayed.
The real-time graph starts with a granularity of 10 seconds to 1 minute. You can add other measures by choosing from the Measures drop-down menu and clicking Select at the top left. You can also monitor other parameters such as receive and transmit MBs, CRC errors, signal/sync losses, and link failures.
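To make the effect of sampling granularity concrete, the sketch below turns two successive byte-counter samples into a throughput value. This paper does not describe how DCFM computes its graph internally, so the function, its parameters, and the sample numbers are assumptions chosen only to illustrate the idea.

```python
def throughput_mb_per_s(prev_bytes: int, curr_bytes: int, interval_s: float) -> float:
    """Average throughput in MB/s between two counter samples.

    prev_bytes and curr_bytes are hypothetical cumulative byte counts
    taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    if curr_bytes < prev_bytes:
        raise ValueError("counter went backwards (reset or wrap)")
    return (curr_bytes - prev_bytes) / interval_s / 1_000_000


# Hypothetical samples taken 10 seconds apart.
print(throughput_mb_per_s(0, 50_000_000, 10))
```

A shorter interval shows short bursts that a 1-minute average would smooth out.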
PERFORMANCE REPORTS
1. Rather than looking at a real-time graph, you can easily generate performance reports from the main Monitor menu or by right-clicking a switch. A sample report for this switch is displayed.
2. From the same menu, you can manage, delete, and export reports to PDF, HTML, or XML. You can generate different reports, from the top 5 to all, FC ports only, device ports only, ISL only, 10GE, or custom. You can also change granularity or the metrics used in the report.
3. You can select a report duration from the last hour to the last week, or customize it by entering your own time frame.
End-to-End Monitors
You might need to closely monitor end-to-end traffic for performance or troubleshooting reasons. This feature allows you to provision end-to-end monitors of selected server and storage ports. You can use them for both real-time and historical performance data.
NOTE: a) An Advanced Performance Monitoring (APM) license is needed on either the initiator or the target switch, and b) this feature is not available if Top Talkers is already in use. If this is the case, disable Top Talkers.
1. From the End to End Monitor dialog box, choose a target and an initiator to define a monitored pair. In the example below, a VMware ESX Server is the initiator and an HP storage array is the target.
2. Click Apply to apply the settings.
3. Choose either the real-time or the historical graph in the same End-to-End (E2E) Monitor dialog box. A graph with traffic information from source to destination is displayed.
The E2E monitor remains in the DCFM configuration until it is removed from the E2E Monitor dialog box. NOTE: Only E2E monitors created in DCFM are shown in DCFM; E2E monitors created in the CLI are not shown.
Top Talkers
When you run into performance issues, identifying the top talkers in the fabric can speed up performance troubleshooting. Top Talker is a feature in the APM (Advanced Performance Monitoring) license, so verify that you have this license installed and that you are running Fabric OS (FOS) 6.2 or later.
The Top Talker feature displays the connections that are using the most bandwidth on the selected device or port. Unlike E2E Monitors, you can use it only in real time. Top Talkers can be enabled on the device (Fabric mode) or on one of the F_Ports on the device (F_Port mode). You can have multiple Top Talker monitors configured at the same time on up to 10 switches in Fabric mode and 32 ports and 10 switches in F_Port mode; however, you can monitor only one device or port for each Top Talker you configure.
Choose Monitor > Performance > Top Talker or right-click a switch (in the example below, a Brocade DCX). A list of the current top talkers on this switch is displayed. You can also configure Top Talker for an F_Port, but not both Fabric mode and F_Port mode at the same time.
From the same window, you can select other switches or ports, choose how many top talkers you want to display, and tune the refresh interval.
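Conceptually, a top talkers list is a ranking of flows by bandwidth, limited to the number of entries you choose to display. The sketch below illustrates that ranking only; the Flow type, the device names, and the sample rates are hypothetical and do not come from DCFM or Fabric OS.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Flow(NamedTuple):
    """A hypothetical source-to-destination flow and its measured rate."""
    source: str
    destination: str
    mb_per_s: float


def top_talkers(flows: Iterable[Flow], count: int = 5) -> List[Flow]:
    """Return the 'count' flows using the most bandwidth, largest first."""
    return heapq.nlargest(count, flows, key=lambda flow: flow.mb_per_s)


# Hypothetical sample data for illustration only.
flows = [
    Flow("esx-01", "array-a", 120.0),
    Flow("esx-02", "array-a", 310.5),
    Flow("backup-01", "tape-01", 95.2),
    Flow("esx-03", "array-b", 205.0),
]
for flow in top_talkers(flows, count=2):
    print(f"{flow.source} -> {flow.destination}: {flow.mb_per_s} MB/s")
```

Changing count corresponds to choosing how many top talkers to display.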
The basic function of threshold limits is to monitor the health of a component or behavior by sampling the status and comparing the sample data to the limits. If a sample exceeds the threshold limits, you are notified via one or more specified methods. When setting threshold limits, you want to know what events to specify and what threshold to set. For example, if you set the threshold for a particular critical event to 100%, by the time you are notified, it may be too late to prevent a failure. But when you set the threshold to 85%, for example, you may be able to prevent the failure from occurring.
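The difference between a 100% threshold and an 85% threshold can be shown with a small check. This is a sketch of the reasoning above, not Fabric Watch logic; the function name, the limit of 200, and the sample readings are assumptions for illustration.

```python
def should_alert(value: float, limit: float, threshold_percent: float = 85.0) -> bool:
    """True once 'value' reaches the given percentage of its hard limit."""
    return value >= limit * threshold_percent / 100.0


# Hypothetical readings approaching a hard limit of 200.
limit = 200.0
for reading in (150.0, 170.0, 200.0):
    early = should_alert(reading, limit, threshold_percent=85.0)
    late = should_alert(reading, limit, threshold_percent=100.0)
    print(f"reading={reading}: alert at 85%={early}, alert at 100%={late}")
```

With the 85% setting, the alert fires at 170, while the 100% setting stays silent until the limit itself is reached.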
Use the low value for SFP RX power rating and fan speed. The power rating alert is very hard to implement because different SFPs have different power ratings. Use the in-between value for the performance counters, so that you are alerted when performance returns to normal.
In Brocade Fabric Watch there are counters and measured values. Counters only go up, to a specified maximum. Measured values can vary randomly. If you are setting a threshold on a counter, you must specify a baseline period of time, the TimeBase (in seconds, minutes, hours, or days). The alarm trigger is reset after the specified period of time. Otherwise an event is triggered once and only once, which is not what you usually want.
You can specify how you are alerted. The most commonly used methods are:
- Error Log. Output is sent to DCFM and captured in the Master Log. You can use event policies in DCFM to take further action on these events.
- Error Log. Output is sent to the syslog server when you point the switch to a syslog server. The syslog output can be further processed by using a script that can pass alerts to central monitoring systems such as IBM Tivoli TPC, HP OpenView, or BMC Patrol. Post-processing could include changing priority, blocking some messages, and adding more information to events.
- SNMP trap. Traps can be sent to central monitoring systems such as IBM Tivoli TPC, HP OpenView, or BMC Patrol. Traps are not sent in a user-friendly format and are more difficult to process than the syslog method.
- E-mail. E-mail alerts are very tedious to set up.
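The role of the time base for counters can be illustrated with a minimal model. The class below is an assumption-laden sketch, not Fabric Watch's implementation: it counts events in a window, fires once when the count exceeds the threshold, and re-arms when the window expires. With no time base, it fires once and never again, which mirrors the behavior described above.

```python
from typing import Optional


class TimeBaseCounter:
    """Hypothetical model of a counter threshold with an optional time base."""

    def __init__(self, threshold: int, time_base_s: Optional[float]):
        self.threshold = threshold
        self.time_base_s = time_base_s  # None means no time base
        self.window_start = 0.0
        self.count = 0
        self.triggered = False

    def record(self, now_s: float, events: int = 1) -> bool:
        """Add events at time now_s; return True if this call fires an alert."""
        window_expired = (
            self.time_base_s is not None
            and now_s - self.window_start >= self.time_base_s
        )
        if window_expired:
            # Start a new window: clear the count and re-arm the trigger.
            self.window_start = now_s
            self.count = 0
            self.triggered = False
        self.count += events
        if self.count > self.threshold and not self.triggered:
            self.triggered = True
            return True
        return False


# Hypothetical example: more than 5 errors per 60-second time base.
counter = TimeBaseCounter(threshold=5, time_base_s=60)
print(counter.record(0, events=6))   # first window exceeds the threshold
print(counter.record(10, events=1))  # same window, already triggered
print(counter.record(61, events=6))  # new window, trigger is re-armed
```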
As a best practice, here are examples of customized settings that deviate from the default settings. Enable class areas in the same order as the following sections.
* Currently available only in FOS 6.3 on 8 Gbps devices (Brocade 300, 5100, 5300, DCX, DCX-4S)
NOTE: Brocade Fabric Watch messages are documented in the Brocade Fabric OS Message Reference in Chapter 42, FW System Messages.
2. Click the Add button to display the New Threshold Policy dialog box.
3. Enter a name and description for the new threshold policy.
4. From the Policy Type drop-down menu, choose the correct port type.
5. From the Measure drop-down menu, choose which type of utilization you want, either Tx or Rx but not both.
6. Next, enter the high and low boundaries. Then, in the Buffer Size field, enter a percentage value and click the right arrow.
7. DCFM may prompt you to change the Buffer Size value, as it does in this example. Here the difference between the high and low boundaries is 70, and 70 divided by 2 equals 35. Enter a new value and click the right arrow button.
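The arithmetic in step 7 can be written out as follows. The sketch assumes that half the distance between the two boundaries acts as an upper limit on the buffer size, which is an interpretation of the prompt rather than something this paper states outright. The function names and the boundary values 80 and 10 are hypothetical; they are chosen only because their difference is 70.

```python
def buffer_limit(high: float, low: float) -> float:
    """Half the distance between the high and low boundaries."""
    if high <= low:
        raise ValueError("high boundary must be greater than low boundary")
    return (high - low) / 2


def buffer_is_acceptable(buffer_size: float, high: float, low: float) -> bool:
    """Assumption: a buffer is acceptable if it does not exceed the limit."""
    return 0 <= buffer_size <= buffer_limit(high, low)


# Hypothetical boundaries whose difference is 70, as in the example above.
print(buffer_limit(80, 10))
print(buffer_is_acceptable(40, 80, 10))
print(buffer_is_acceptable(10, 80, 10))
```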
8. To apply the new policy (Test-Threshold in this example), select the 7800_fabric and click the right arrow. If a Fabric Watch license is not installed on any of the switches in the fabric, the following message is displayed to prompt you to install the license.
Since no Fabric Watch license is installed in this example, click OK and move on.
9. The Confirm Threshold Changes dialog box is displayed. Make a selection and click OK.
10. In the main window, look at the Master Log messages.
If you look at the first two lines, the test failed because the switches did not have a Fabric Watch license installed. Even if you were to install Fabric Watch licenses on these switches, the same set of error messages would appear if you tried to apply the threshold alert at the switch level.
11. Go back to the Set Threshold Policies dialog box and look at the following:
If you are planning to add switches and would like to have your fabric-wide policies applied to new switches added to the fabric, check Assign fabric-level policies to new switches. Note that this applies to all fabrics that have policies set at the fabric level, so if you do not want to overwrite existing switch-level policies, leave this unselected.
2. In the Set Threshold Policies dialog box, select the E_Port threshold policy, select the fabric for the policy, and click the right arrow button.
3. In the Policy Group list, the threshold policy called test appears. Click the Apply button.
4. Since this is a new fabric, the Confirm Threshold Changes dialog box is displayed. Select the second option and click OK.
A pop-up about applying the new policy is displayed briefly. Once it disappears, the three green check marks next to the switches in the Fabrics and Products list indicate that the fabric-level E_Port policy has been applied.
2. Click the Apply button, and the Confirm Threshold Changes dialog box is displayed. Accept the default selection and click OK.
In this example, the first option is selected by default and the default is accepted. If the second option were selected, the fabric-level policies applied earlier would be overwritten. A pop-up about applying the new policy is displayed briefly.
In the Assigned Threshold Policies list, green check marks indicate that the fabric-wide and switch-level policies have been applied.
In this example, the option at the bottom is not selected and fabric-level policies will not be applied to all new switches.
2. With the duplicate policy selected, click the Edit button to display the Edit Threshold Policy dialog box.
3. Give the duplicate policy a new name, change the measure to Rx, and click OK.
4.
NOTE: At the fabric level, you can apply only one fabric-wide policy. At the switch level, you can configure two threshold policies.
2010 Brocade Communications Systems, Inc. All Rights Reserved. 02/10 GA-CG-247-00 Brocade, the B-wing symbol, BigIron, DCX, Fabric OS, FastIron, IronView, NetIron, SAN Health, ServerIron, and TurboIron are registered trademarks, and Brocade Assurance, DCFM, Extraordinary Networks, and Brocade NET Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or service names mentioned are or may be trademarks or service marks of their respective owners. Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government.