Revision C
TRADEMARK ATTRIBUTIONS
Intel and the Intel logo are registered trademarks of the Intel Corporation in the US and/or other countries. McAfee and the McAfee logo, McAfee Active
Protection, McAfee DeepSAFE, ePolicy Orchestrator, McAfee ePO, McAfee EMM, McAfee Evader, Foundscore, Foundstone, Global Threat Intelligence,
McAfee LiveSafe, Policy Lab, McAfee QuickClean, Safe Eyes, McAfee SECURE, McAfee Shredder, SiteAdvisor, McAfee Stinger, McAfee TechMaster, McAfee
Total Protection, TrustedSource, VirusScan are registered trademarks or trademarks of McAfee, Inc. or its subsidiaries in the US and other countries.
Other marks and brands may be claimed as the property of others.
LICENSE INFORMATION
License Agreement
NOTICE TO ALL USERS: CAREFULLY READ THE APPROPRIATE LEGAL AGREEMENT CORRESPONDING TO THE LICENSE YOU PURCHASED, WHICH SETS
FORTH THE GENERAL TERMS AND CONDITIONS FOR THE USE OF THE LICENSED SOFTWARE. IF YOU DO NOT KNOW WHICH TYPE OF LICENSE YOU
HAVE ACQUIRED, PLEASE CONSULT THE SALES AND OTHER RELATED LICENSE GRANT OR PURCHASE ORDER DOCUMENTS THAT ACCOMPANY YOUR
SOFTWARE PACKAGING OR THAT YOU HAVE RECEIVED SEPARATELY AS PART OF THE PURCHASE (AS A BOOKLET, A FILE ON THE PRODUCT CD, OR A
FILE AVAILABLE ON THE WEBSITE FROM WHICH YOU DOWNLOADED THE SOFTWARE PACKAGE). IF YOU DO NOT AGREE TO ALL OF THE TERMS SET
FORTH IN THE AGREEMENT, DO NOT INSTALL THE SOFTWARE. IF APPLICABLE, YOU MAY RETURN THE PRODUCT TO MCAFEE OR THE PLACE OF
PURCHASE FOR A FULL REFUND.
Preface
  About this guide
    Audience
    Conventions
  Find product documentation
2 Performance issues
  Sniffer trace
  Data link errors
    Half-duplex setting
    Full-duplex setting
5 Error messages
  Error messages for RADIUS servers
  Error messages for LDAP server
6 Troubleshooting scenarios
  Network outage due to unresolved ARP traffic
  Delay in alerts between the Sensor and Manager
  Sensor-Manager Connectivity Issues
  Wrong country name in IPS alerts
  Wrong country name in ACL alerts
Index
This guide provides the information you need to configure, use, and maintain your McAfee product.
Contents
About this guide
Find product documentation
Audience
McAfee documentation is carefully researched and written for the target audience.
The information in this guide is intended primarily for:
Administrators People who implement and enforce the company's security program.
Users People who use the computer where the software is running and can access some or all of
its features.
Conventions
This guide uses these typographical conventions and icons.
Book title, term, emphasis    Title of a book, chapter, or topic; a new term; emphasis.
Bold                          Text that is strongly emphasized.
User input, code, message     Commands and other text that the user types; a code sample; a displayed message.
Interface text                Words from the product interface like options, menus, buttons, and dialog boxes.
Hypertext blue                A link to a topic or to an external website.
Note:                         Additional information, like an alternate method of accessing an option.
Tip:                          Suggestions and recommendations.
Task
1 Go to the ServicePortal at https://support.mcafee.com and click the Knowledge Center tab.
2 In the Knowledge Base pane under Content Source, click Product Documentation.
3 Select a product and version, then click Search to display a list of documents.
McAfee Network Security Platform is a combination of network appliances and software, built for the
accurate detection and prevention of intrusions and network misuse.
Sensors are high-performance, scalable, and flexible content processing appliances built for the
accurate detection and prevention of intrusions, misuse, malware, denial of service (DoS) attacks, and
distributed denial of service (DDoS) attacks. Sensors can be physical or virtual appliances. Sensors are
specifically designed to handle traffic at wire-speed, efficiently inspect and detect intrusions with a
high degree of accuracy, and flexible enough to adapt to the security needs of any enterprise
environment.
Network Security Platform offers several types of Sensor platforms providing different bandwidth and
deployment strategies.
M-series: M-8000, M-6050, M-4050, M-3050, M-2850, M-2950, M-1450, and M-1250
NS-series: NS9100, NS9200, NS9300, NS7100, NS7200, NS7300, NS5100, NS5200, NS3200 and
NS3100.
This section lists some troubleshooting scenarios, procedures, and checks that can be followed during
a Sensor's Return Merchandise Authorization (RMA) process.
Contents
NS-series Sensors CRUs and FRUs
View diagnostic and system information for NS-series Sensors
Lspci for NS-series Sensors
M-series Sensor replacement for defective I-series Sensors
Check XLRs for M-series Sensors
Sibytes for I-series Sensors
Check for monitoring ports failure
Check for management ports failure
Check for console port failure
Check for Sensor LED or fan failure
Check power supply in the Sensor
Check for flash corruption in the Sensor
Perform flash recovery
Cache and memory errors
Verify passive fail-open connectivity
Tasks suspended on Sibytes
PSUs
Fans
The Manager displays a system event message indicating which of the two PSUs is bad. (NS3x00 has only
one power supply, which is FRU only.)
The following are the reasons for a power supply error message:
Mar 15 19:28:37 localhost tL: EMER montor|Couldn't determine power supply 1 status!
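When many such messages need to be reviewed, a small script can split each log line into its fields. The following is a minimal sketch, assuming the line layout seen in the sample above (timestamp, host, process, severity, module, then the message after a `|`); the pattern and the function names are illustrative and are not part of the product.

```python
import re

# Pattern inferred from the sample log line in this guide; real logs may vary.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<process>\S+): "
    r"(?P<severity>[A-Z]+) (?P<module>[^|]+)\|(?P<message>.*)$"
)


def parse_sensor_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def emergency_messages(lines):
    """Yield (module, message) for every line logged at EMER severity."""
    for line in lines:
        fields = parse_sensor_log_line(line)
        if fields and fields["severity"] == "EMER":
            yield fields["module"], fields["message"].strip()
```

Passing the lines of a saved sensor.log to `emergency_messages` gives a quick list of the modules that reported emergency-level conditions.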
Fans
Manager displays a system event message indicating which fan FRU is in bad status. Fan number is
labeled on the system chassis.
The following image shows the system event indicating that the Fan#3 is in bad status.
For individual interface port troubleshooting, perform the usual swap test. Swap out the IO module
card itself or swap the interface port cable with a known good one. Verify if the problem continues
even after the swap. The aim is to isolate the bad IO module card, transceiver, cable, or a
particular interface port.
SSDs
NS7x00 series
Orange Beach Lite Cards
DIMMs
NS5x00 series
DIMMs
SSD
Power supply
FANs
SSDs (NS9x00)
The Sensor CLI indicates which of the two SSDs is in bad status.
SSD #0 is the top SSD (labeled 00 or 0 on the SSD cable).
Sensor logs also contain information indicating which SSD is in bad status.
The following image displays the labels 00 and 01 on the SSD cable.
Lspci output.
Sensor.dbg and sensor.log files display errors instead of the following log messages:
Above is the normal system output. If any group of 4 lines is missing, it indicates that the NIC
part of the OB card has failed. Each group of 4 lines represents the OB card on one of the 4 Xeon CPUs
in the system. For example, if the third group of 4 lines is missing, replace the OB card on the third
Xeon CPU PCIe slot.
It is possible for just one line to be missing from a 4-line group. In such a case, the entire OB card
has to be replaced, since each OB card corresponds to a whole 4-line group.
Lspci output.
Sensor.dbg and sensor.log files display errors instead of the following log messages:
83:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c2:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:00.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:01.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
c3:08.0 PCI bridge: PLX Technology, Inc. Device 8724 (rev ba)
Above is the normal system output. If any group of 4 lines is missing, it indicates that the NIC
part of the OB card has failed. Each group of 4 lines represents the OB card on one of the 4 Xeon CPUs
in the system. For example, if the third group of 4 lines is missing, replace the OB card on the third
Xeon CPU PCIe slot.
It is possible for just one line to be missing from a 4-line group. In such a case, the entire OB card
has to be replaced, since each OB card corresponds to a whole 4-line group.
Sample output
KR-9100# lspci | grep 434
09:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
49:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
81:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
c1:00.0 Co-processor: Intel Corporation Device 0434 (rev 21)
Above is the normal system output. If any one line is missing from the output, it indicates that the
crypto device on the OB card has failed. Each line represents the OB card on one of the 4 CPUs in the
system. For example, if the second line is missing, then the OB card on the second Xeon CPU PCIe slot
has to be replaced.
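The line-by-line comparison described above can be automated against saved lspci output. This is a minimal sketch, assuming the four bus addresses shown in the sample output are the expected ones; other models or hardware revisions may enumerate differently, and the function name is hypothetical.

```python
# Bus addresses taken from the sample output in this guide (assumption:
# they are the same on the Sensor being checked).
EXPECTED_CRYPTO_ADDRESSES = ["09:00.0", "49:00.0", "81:00.0", "c1:00.0"]


def missing_crypto_devices(lspci_output, expected=EXPECTED_CRYPTO_ADDRESSES):
    """Return the expected PCI addresses that have no 'Device 0434' line."""
    found = set()
    for line in lspci_output.splitlines():
        if "Device 0434" in line:
            # The PCI address is the first whitespace-separated token.
            found.add(line.split()[0])
    return [address for address in expected if address not in found]
```

An empty result corresponds to the normal output. A result such as the second address alone corresponds to the case where the second line is missing.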
NS7x00 series Sensors have 1 or 2 OB Lite cards installed compared to NS9x00 series Sensors that
have 4 OB cards installed. The debug method is identical to that of OB Cards in NS9x00 series
Sensors.
If there is an error with this card, the Sensor reboots and does not come back to working condition. To
debug, console access is required to capture the output.
Sensor.dbg and sensor.log files display errors instead of the following informational messages:
Mar 6 22:45:28 localhost tL: EMER sysctl|chkCaveCreekVersionAndCount: *** ERROR *** NOT ALL
CAVE CREEK CO-PROCESSORS DETECTED, EXPECTED 2 , AVAILABLE 1
.
Error, 4 NIC cards not detected
Mar 6 22:45:28 localhost tL: EMER sysctl|*********************
Mar 6 22:45:28 localhost tL: EMER sysctl| NIC CARDS NOT DETECTED
Mar 6 22:45:28 localhost tL: EMER sysctl|*********************
.
Error, 2 Crypto Chips not detected
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
Mar 6 22:45:28 localhost tL: EMER sysctl| Crypto Chips NOT DETECTED
Feb 18 02:21:12 localhost tL: EMER sysctl|*********************
Above is the normal system output. If any group of two lines is missing, it indicates that the
NIC part of the OB Lite card has failed. Each group of two lines represents the OB Lite card on one of
the 2 Xeon CPUs in the system. For example, if the second group of two lines is missing, then
replace the OB Lite card on the second Xeon CPU PCIe slot.
It is possible for just one line to be missing from a two-line group. In such a case, the entire OB
Lite card has to be replaced, since each OB Lite card corresponds to both lines in the two-line group.
Above is the normal system output. If any one line is missing, it indicates that the crypto device
in the OB Lite card has failed. Each line represents the OB Lite card on one of the 2 CPUs in the
system. For example, if the second line is missing, then the OB Lite card on the second Xeon CPU PCIe
slot has to be replaced.
DIMMs
DIMM errors are identified by the following error messages in the /var/log/messages file.
Jan 21 12:15:01 localhost klogd: [ 749.407598] [Hardware Error]: Run the message through
'mcelog --ascii' to decode.
Jan 21 12:15:01 localhost klogd: [ 749.416032] [Hardware Error]: No human readable MCE
decoding support on this CPU type.
To pinpoint which DIMM is bad, go into the system BIOS menu and check the DIMM status under the
memory configuration page.
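Before going into the BIOS, it can help to confirm that hardware error messages are actually present in the log. The following is a minimal sketch that filters log lines for the `[Hardware Error]` tag shown in the messages above; the function name is illustrative, and the file path in the usage note is the one named in this section.

```python
def hardware_error_lines(log_lines):
    """Return the log lines that carry the kernel '[Hardware Error]' tag."""
    return [line.rstrip("\n") for line in log_lines if "[Hardware Error]" in line]
```

For example, `with open("/var/log/messages") as log: errors = hardware_error_lines(log)` collects the matching lines. A non-empty list indicates that the DIMM status should be checked in the BIOS.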
To view diagnostic and system information, run the command run diag_show_system_info.
Syntax:
run diag_show_system_info
Sample output
The run should be successful with no errors seen. The temperature and fan speed should be within
range.
Power supply health should be either OK or N/A. The diagnostic result should display as DIAGNOSTIC
PASSED!.
If any other value appears, it indicates that an issue exists.
Run the command run diag_pld_test.
Sample output
0x01: BCM54980_SUPER_ISOLATE
Scratch pad : 0x00
DIAGNOSTIC PASSED!
If the diagnostic result is not passed and error messages are present then it indicates that a problem
exists in the CPLD device.
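When diagnostic output is captured to a file, the pass check can be scripted. This is a minimal sketch that looks only for the `DIAGNOSTIC PASSED!` marker named above; it does not interpret any other field, and the function name is hypothetical.

```python
def diagnostic_passed(output):
    """Return True only if a line of the output is exactly 'DIAGNOSTIC PASSED!'."""
    return any(line.strip() == "DIAGNOSTIC PASSED!" for line in output.splitlines())
```

If the function returns False, the full output still has to be reviewed for the error messages mentioned above.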
Sample output
Sample output
If there are not 16 lines in the output, it indicates that a problem exists in one of the PLX devices.
If a particular I-series Sensor is not in the inventory, a replacement M-series Sensor should be sent to
the customer. Below is a matrix of the M-series Sensor models that should be sent as replacements
for I-series Sensor models.
Errors seen:
The following error is seen in sensor.log
Ideally, the values of all XLRs should be 32. In the above example, XLRC is 0. In any given case, either
XLR could be zero.
Troubleshooting Steps:
Power cycle (not reboot) the Sensor in order to initialize the XLRs. If the same errors are seen even
after the power cycle, it signifies that the XLR is dead and an RMA needs to be performed for the
Sensor.
Errors seen
The following error is seen in sensor.log.
Troubleshooting steps
1 Telnet to the Sibytes at 127.4.x.1, where x could vary from 1 to 8 depending on the Sensor model.
2 Power cycle (not reboot) the Sensor in order to initialize the Sibytes.
3 If the same errors exist after the power cycle, it signifies that the Sibytes are dead and
an RMA has to be performed for the Sensor.
Task
1 Check for faulty cables and replace with known good ones.
3 Check speed/duplex settings through the Sensor CLI and ensure that they match those on the
switch and the end device to which it is connected.
4 Check for CRC errors on the interface ports. If CRC errors are incrementing then they may be
causing the link/port failure.
Task
1 Check for faulty cables and replace with known good ones.
2 Check the speed/duplex settings through the Sensor CLI and ensure that they match those on the
switch and the end device to which it is connected.
Task
1 Connect the console port of the Sensor to a Windows PC using a known good console cable and
open a HyperTerminal window with the settings as shown below:
2 If a blank screen is displayed, use a null modem cable and connect it to the AUX port of the Sensor.
Task
1 If the LED on the Sensor's front panel is not turned on when it should have been, check if it is
physically there by shining a light into the enclosure.
2 If the LED is present, check the Manager for errors (for example, a temperature warning or fan
error) and rectify the error accordingly.
3 If there are no errors, then the LED could be faulty. RMA the Sensor at the customer's discretion.
4 If the fan status LED is off or displays in amber, physically check the fan and verify whether it is
running.
b If the fan still does not run, RMA the Sensor. If the fan runs, then RMA the faulty fan module.
The following table lists the fan module SKU for each Sensor model for which the fan module is field replaceable.
Table 1-2
Model SKU
M-2750 IAC-N450-FAN
M-2850 IAC-N450-FAN
M-2950 IAC-N450-FAN
M-3030
M-3050 IAC-MSER-FAN
M-4030 IAC-MSER-FAN
M-4050 IAC-MSER-FAN
M-6030 IAC-MSER-FAN
M-6050 IAC-MSER-FAN
M-8000 IAC-MSER-FAN
N-450 IAC-N450-FAN
N-550 IAC-N450-FAN
NS3100 IPS-NS3100
NS3200 IPS-NS3200
NS5100 IPS-NS5100
NS5200 IPS-NS5200
NS7100 IPS-NS7100
NS7200 IPS-NS7200
NS7300 IPS-NS7300
NS9100 IPS-NS9100
NS9200 IPS-NS9200
NS9300 IPS-NS9300
Task
1 If the Sensor does not power on, replace the power supply with a known good spare power supply.
2 In case there are dual power supplies and the LED for a power supply turns from amber to green or
turns off completely, check the dashboard for error messages. If a power supply error is seen,
replace the faulty power supply with a known good power supply.
3 In case any of the following Sensors do not power up using a single power supply unit, then RMA
the Sensor:
I-1200
I-1400
I-2600
Task
1 Download the netboot procedure to recover flash.
2 If internal recovery fails then use external flash recovery (see KB50046).
3 In case recovery from netboot fails, use the external recovery flash card to recover the Sensor. See
KB50046 to recover Sensor from external flash card.
During boot-up, if the following message is seen on the console, an RMA can be performed for the Sensor:
Err - no DIMMs found.
Task
1 Verify the Sensor connectivity with peer devices.
2 Verify the fail-open kit connectivity with known good cables to Sensor and peer device.
3 If connectivity is not available with a gigabit fail-open kit, verify the Tx and Rx sides of
the cables by checking for a red light (Tx cable) and no light (Rx cable). If they differ,
swap the cables on one side only.
4 If the connectivity is not available, change the fail-open kit including the controller card with spare
known good units.
Errors seen
The following error (or similar error) is seen in sensor.log.
Troubleshooting steps
1 Log in to the Sensor using nobrk1n and then telnet into the Sibytes.
error_code == 0x6
error_code==0x7
Bits 8 to 15 of register 0x100208C8 ("Memory & I/O Error Counter Register") are non-zero
The bit 0 is on the right and you need to move to the left to check other bits.
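The bit numbering described above can be checked with a short helper. This is a minimal sketch, assuming the register value has already been read and is available as an integer; the function names are illustrative.

```python
def error_counter_bits(register_value):
    """Extract bits 8 to 15, where bit 0 is the rightmost (least significant) bit."""
    return (register_value >> 8) & 0xFF


def has_memory_io_errors(register_value):
    """Return True if any of bits 8 to 15 is set."""
    return error_counter_bits(register_value) != 0
```

For example, a register value of 0x00000300 has bits 8 and 9 set, so `error_counter_bits` returns 3, while 0x000000FF only has bits 0 to 7 set and returns 0.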
Most performance issues are related to switch port configuration, duplex mismatches, link up/down
situations, and data link errors.
Contents
Sniffer trace
Data link errors
Sniffer trace
A Sniffer details packet transfer, and thus a Sniffer trace analysis can help pinpoint switch and McAfee
Network Security Platform performance or connectivity issues when the issues persist after you have
exhausted the other suggestions in this document. Sniffer trace analysis reveals every packet on the
wire and pinpoints the exact problem.
Note that it may be important to obtain several Sniffer traces from different ports on different
switches, and that it is useful to monitor ("span") ports rather than spanning VLANs when
troubleshooting switch connectivity issues.
Half-duplex setting
When operating with a duplex setting of half-duplex, some data link errors such as FCS, alignment,
runts, and collisions are normal. Generally, a one percent ratio of errors to total traffic is acceptable
for half-duplex connections. If the ratio of errors to input packets is greater than two or three percent,
performance degradation may be noticeable.
In half-duplex environments, it is possible for both the switch and the connected device to sense the
wire and transmit at exactly the same time, resulting in a collision. Collisions can cause runts, FCS,
and alignment errors, which are caused when the frame is not completely copied to the wire, resulting
in fragmented frames.
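The half-duplex guideline can be expressed as a simple ratio check. The sketch below assumes that error and packet counts have been read from the interface counters; the 1 percent and 2 percent thresholds come from the guideline above, and the labels and function names are illustrative.

```python
def error_ratio_percent(error_count, total_packets):
    """Return data link errors as a percentage of total packets."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return 100.0 * error_count / total_packets


def classify_half_duplex(error_count, total_packets):
    """Classify a half-duplex link using the thresholds quoted in this section."""
    ratio = error_ratio_percent(error_count, total_packets)
    if ratio <= 1.0:
        return "acceptable"
    if ratio > 2.0:
        return "degradation likely"
    return "borderline"
```

With 5 errors in 1,000 packets the ratio is 0.5 percent, which is within the acceptable range; with 30 errors in 1,000 packets it is 3 percent, where performance degradation may be noticeable.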
Full-duplex setting
When operating at full-duplex, FCS, cyclic redundancy checks (CRC), alignment errors, and runt
counters should be minimal. If the link is operating at full-duplex, the collision counter is not active. If
the FCS, CRC, alignment, or runt counters are incrementing, check for a duplex mismatch. Duplex
mismatch is a situation in which the switch is operating at full-duplex and the connected device is
operating at half-duplex, or vice versa. The result of a duplex mismatch is extremely slow
performance, intermittent connectivity, and loss of connection. Other possible causes of data link
errors at full-duplex are bad cables, a faulty switch port, or software or hardware issues.
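Whether a counter is incrementing can be determined by comparing two readings taken some time apart. This is a minimal sketch, assuming the counters have been collected into dictionaries; the counter names used as keys are hypothetical and do not correspond to a specific CLI output.

```python
# Hypothetical counter names; map them to whatever the switch or Sensor reports.
WATCHED_COUNTERS = ("fcs", "crc", "alignment", "runts")


def incrementing_counters(before, after, watched=WATCHED_COUNTERS):
    """Return the watched counters whose value grew between two snapshots."""
    return [name for name in watched if after.get(name, 0) > before.get(name, 0)]
```

On a full-duplex link, a non-empty result is a prompt to check for a duplex mismatch first, and then for bad cables or a faulty switch port.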
This section lists methods for determining and reducing false positives.
Contents
Reduce false positives
Tune your policies
There are two stages to this process: initial policy configuration and policy tuning. Though these are
tedious tasks, McAfee has extended its blocking options to include SmartBlocking, which only activates
blocking when high confidence signatures are matched, thus minimizing the possibility of false
positives. Network Security Platform is replacing its present Recommended for Blocking (RFB)
designation with Recommended for SmartBlocking (RFSB) because this new level of granularity
enables McAfee to recommend many more attacks; the list of RFB attacks is a subset of the list of
RFSB attacks.
The ultimate goal of policy tuning is to eliminate false positives and noise and avoid overwhelming
quantities of legitimate, but anticipated alerts.
We ask that you set your expectations appropriately regarding the elimination of false positives and
noise. A proper Network Security Platform implementation includes multiple tuning phases. False
positives and excess noise are routine for the first 3 to 4 weeks. Once properly tuned, however,
they can be reduced to a rare occurrence.
When initially deployed, Network Security Platform frequently exposes unexpected conditions in the
existing network and application configuration. What may at first seem like a false positive might
actually be the manifestation of a misconfigured router or Web application, for example.
Before you begin, be aware of the network topology and the hosts in your network, so you can
enable the policy to detect the correct set of attacks for your environment.
Take steps to reduce false positives and noise from the start. If you allow a large number of "noisy"
alerts to continue to sound on a very busy network, parsing and pruning the database can quickly
become cumbersome tasks. It is preferable to all parties involved to put energy into preventing
false positives than into working around them. Exception objects are also an option where you can
have custom rule sets specific to your environment. You can disable all alerts that are obviously not
applicable to the hosts that you protect. For example, if you use only Apache Web servers, you can
disable IIS-related attacks.
With Network Security Platform, there are three types of alerts which are often taken as "false
positives:"
Incorrect identification
These alerts typically result from overly aggressive signature design, special characteristics of the user
environment, or system bugs. For example, typical users will never use nested file folders with a path
more than 256 characters long; however, a particular user may push the Windows' free-style naming
to the extreme and create files with path names more than 1024 characters. Issues in this category
are rare. They can be fixed by signature modifications or software bug fixes.
your network: Relevance analysis involves the analysis of the vulnerability relevance of real-time
alerts, using the vulnerability data imported to the Manager database. The imported vulnerability data
can be from Vulnerability Manager or other supported vulnerability scanners such as Nessus. The fact
that the attack failed can help in zeroing in on the type of Web server you use. Users can also better
manage this type of event through policy customization or by installing attack filters.
The noise-to-incorrect-identification ratio can be fairly high, particularly in the following conditions:
the configured policy includes a lot of Informational alerts, or scan alerts which are based on
request activities (such as the All Inclusive policy)
deployment links where there is a lot of hostile traffic, such as in front of a firewall
overly coarse traffic VIDS definition that contains very disparate applications, for example, a highly
aggregated link in dedicated interface mode
Users can effectively manage the noise level by defining appropriate VIDS and customizing the policy
accordingly. For dealing with exceptional hosts, such as a dedicated pentest machine, alert filters can
also be used.
What did you expect to see? What is the vulnerability, if applicable, that the attack indicated by the
alert is supposed to exploit?
Ensure that you capture valid traffic dumps from the attack attempt (for example, have packet
logging enabled so that you can view the resulting packet log).
Determine whether any applications are suspected of triggering the alert: which ones, which
versions, and in what specific configurations.
If you intend to work with McAfee Technical Support on the issue, we ask that you provide the
following information to assist in troubleshooting:
If this occurred in a lab using testing tools rather than live traffic, please provide detailed
information of the attack/test tool used, including its name, version, configuration and where the
traffic originated.
If this is a testing environment using a traffic dump relay, make sure that the traffic dumps are
valid, TCP traffic follows a proper 3-way handshake, and so on.
Also, please provide detailed information of the test configuration in the form of a network
diagram.
Be ready to tell Technical Support how often you are seeing the alerts and whether they are
ongoing.
This section lists the system fault messages visible in the Manager Operational Status viewer,
organized by severity, with Critical messages first, then Errors, then Warnings, then Informational
messages.
You can view the faults from the Operational Status menu in Manager. For more information, see fault
messages for Vulnerability Manager Scheduler and Automatic report import using Scheduler, McAfee
Network Security Platform Integration Guide.
The fault messages you might encounter are listed with their severity and a description, including
information on what action clears the fault. In many cases, the fault clears itself if the condition
causing the fault is resolved. In cases where the fault does not clear, you must acknowledge or delete
it to dismiss it.
For Sensor faults, go through Manager and Sensor faults. Similarly for NTBA issues, refer to Manager
and NTBA faults.
Contents
Manager faults
Sensor faults
NTBA faults
Manager faults
The Manager faults can be classified into critical, error, warning, and informational. The Action column
provides you with troubleshooting tips.
Fault: Dropping alerts and packet logs
Severity: Critical
Description: <Percentage value>% capacity. Dropping alerts and packet logs.
Action: Please perform maintenance operations to clean and tune the database.

Fault: DXLService is down
Severity: Critical
Description: The DXLService is down due to one of the following causes. The action for each cause is listed with it.
  Failure to connect to the ePolicy Orchestrator Server. Action: Check the connectivity between IPS and ePO, or check the logs.
  Failure to connect to the Data eXchange Layer. Action: Check the connectivity between IPS and Data eXchange Layer, or check the logs.
  Failure to start the McAfee Agent service. Action: Check the logs.
  Failure to start the Data eXchange Layer service. Action: Check the logs.

Fault: Fan error
Severity: Critical
Description: The fan has failed.
Action: Check the fan LEDs on the front of the device to ensure all internal fans are functioning. The fault clears when the temperature falls below its internal low temperature threshold.

Fault: Geo IP location file download failure
Severity: Critical
Description: Cannot push Geo IP location file to device <Sensor_name>. See system log for details.
Action: Occurs when the Manager cannot push the Geo IP Location file to a Sensor. Could result from a network connectivity issue.

Fault: Hardware error
Severity: Critical
Description: This is a generic hardware-related error in the device.
Action: Check the device to know more.

Fault: Incompatible custom attack
Severity: Critical
Description: One or more custom attack definitions are incompatible with the current signature set. Error message: <exception string>.
Action: The Custom Attack Editor indicates which definitions are incompatible. (Incompatibility could result from attack or signature overlap.) Update the definition in the Custom Attack Editor and try again.

Fault: Incompatible UDS signature
Severity: Critical
Description: A user-defined signature (UDS) is incompatible with the current signature set.
Action: You will need to edit your existing UDS attacks to make them conform to the new signature set definitions. Bring up the Custom Attack Editor (IPS Settings > Advanced Policies > Custom Attack Editor) and manually perform the edit and validation. This fault clears when a subsequent UDS compilation succeeds.

Fault: Link failure of <Sensor>
Severity: Critical
Description: The link between this port and the external device to which it is connected is down.
Action: This is a connectivity issue. Contact your IT department to troubleshoot network connectivity. This fault clears when communication is re-established.

Fault: Low JVM Memory
Severity: Critical
Description: The Manager is experiencing high memory usage. Available system memory is low.
Action: Reboot the Manager server.

Fault: Low Tomcat JVM Memory
Severity: Critical
Description: The Manager is experiencing high memory usage. Available system memory is low.
Action: Reboot the Manager server.

Fault: The Manager has moved to MDR mode, and this Manager cannot handle the change
Severity: Critical
Description: The Manager server is in Standby mode (MDR action) and the active peer Manager does not have Central Manager information.
Action: If the Manager server has moved to Standby, then make Central Manager with latest Manager information as Active or reform MDR; if the Manager has moved to Standby, then make the Manager with Central Manager information as Active or make sure that active Central Manager or Manager has latest configuration data.

Fault: There is conflict in the MDR configuration for the Manager <Manager_name>
Severity: Critical
Description: The configuration between an existing MDR pair (Manager 1 and Manager 2 - both Managers are Central Manager configured) is disabled and a new MDR pair configuration has been created with Manager 2 and Manager 3. Manager 2 is in Standby mode and Manager 3 does not have Central Manager configuration.
Action: Dissolve and recreate an MDR pair.

Fault: The MDR connection is down.
Severity: Critical
Description: The communication from <Primary/Secondary> to <Secondary/Primary> is down.
Action: Please look into the connection statuses of the systems and Manager logs.
Vulnerability Manager configuration
Fault: Scheduled Vulnerability Manager vulnerability data import failed
Severity: Critical
Description: This message indicates that the vulnerability data import by the Scheduler from Vulnerability Manager database has failed.
Action: Refer to error logs for details

Fault: Vulnerability data import from Vulnerability Manager failed
Severity: Critical
Description: Scheduled import of vulnerability data failed from FoundStone database server into ISM database table
Action: This message is informational.

Fault: Network Security Central Manager UDS signature synchronization failed
Severity: Critical
Description: Port conflict in Network Security Central Manager UDS synchronization. Port already in use by UDS. Free this port for Central Manager synchronization to succeed.
Action: Free this port for Network Security Central Manager synchronization to succeed.
Licensing
Fault: License expires soon
Severity: Critical
Description: Indicates that your Network Security Platform license is about to expire; this fault first appears 7 days prior to expiration.
Action: Contact licensing@mcafee.com for a current license. This fault clears when the license is current. Please contact Technical Support or your local reseller.

Fault: License expired
Severity: Critical
Description: Indicates that your Network Security Platform license has expired.
Action: Contact licensing@mcafee.com for a current license. This fault clears when the license is current.

Fault: MLC Server Connection Error
Severity: Error
Description: Manager has no connection to configured MLC server.
Action: Indicates that the Manager has no connection to the configured MLC server. This can be due to an incorrect certificate import, network connectivity issues, or issues internal to the MLC server. Refer to the MLC integration documentation for more information.
Mail server and queue
Fault: Alert queue full
Severity: Error
Description: The Manager has reached its limit <queue_size_limit> for alerts that can be queued for storage in the database. (<no_of_alerts> alerts dropped)
Action: Indicates that the Manager has reached the limit (default of 100,000) of alerts that can be queued for storage in the database. Alerts are being detected by your Sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity. Check the alerts you are receiving to see what is causing the heavy traffic on the Sensor(s).

Fault: E-mail server unreachable
Severity: Error
Description: Connection attempt to e-mail server <mail server> failed. Error: <Messaging Exception String>.
Action: This fault indicates that the SMTP mailer host is unreachable, and occurs when the Manager fails to send an email notification or a scheduled report. This fault clears when an attempt to send the email is successful.

Fault: Packet log queue full
Severity: Error
Description: The Manager packet log queue has reached its maximum size of <pktlog_queue_size_limit>. (<no_of_pktlogs_dropped> packets)
Action: The Manager packet log queue has reached its maximum size (default 200,000 packets), and is unable to process packets until there is space in the queue. Packets are being detected by your Sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity. Check the packets you are receiving to see what is causing the heavy traffic on the Sensor(s).

Fault: Packet capturing error
Severity: Error
Description: One of the following:
  The device detected an error connecting to the SCP server while attempting to transfer a packet capture file.
  The device is unable to send the packet capture file via SCP.
  The device has stopped capturing packets due to insufficient internal memory.
  The device experienced an internal error while performing the packet capture.
  The device is unable to authenticate with target server to transfer a packet capture file.
Action: Device shall attempt to automatically recover. Check Packet Capture configuration.

Fault: Queue size full
Severity: Error
Description: The Manager alert queue has reached its maximum size (default 200,000 alerts), and is unable to process alerts until there is space in the queue. Alerts are being detected by your Sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the alerts you are receiving to see what is causing the heavy traffic on the Sensor(s).
Description: The Manager alert slow consumer (SNMP Trap forwarder) queue has reached its maximum size of alerts dropped)
Action: The Manager alert slow consumer (SNMP Trap forwarder) queue has reached its maximum size, and is unable to forward alerts until there is space in the queue. Alerts are being detected by your Sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity. Check the alerts you are receiving to see what is causing the heavy traffic on the Sensor(s).

Fault: Syslog Server unreachable
Severity: Error
Description: Connection attempt to Syslog server <server address> failed. Error: <Syslog TCP connection failed>.
Action: This fault indicates that the Syslog Server is unreachable, and occurs
when the Manager fails to
send a syslog notification.
This fault clears when an
attempt to send the syslog
is successful.
Failed to backup IDS Warning Failed to backup Policy. Delete previous versions.
Policy
Warning Failed to backup Policy. Please contact technical
support or local reseller.
Failed to backup Warning Failed to backup Policy. Please contact technical
Recon Policy support or local reseller.
Warning Failed to backup Policy. Delete previous version.
Initiating Audit Log Warning The Audit Log capacity of the Manager This fault will be raised
file rotation was reached, and the Manager will after a configured number
begin overwriting the oldest records of records written. No
with the newest records (i.e. first in action is required.
first out). The capacity is configured
The fault indicates the number of in the iv_emsproperties
records that have been written to the table in MySQL; this option
audit log; an equal number of audit can be turned off. If this
log records are now being overwritten. feature is enabled, when
disk capacity is reached or
audit log capacity is
reached, then Audit Log
rotation is initiated.
Invalid Malware File Warning The available free disk space on the Reduce the maximum disk
Archive Storage Manager is less than the disk space space allowed for one or
Settings required to support the current more file type.
malware storage settings.
MLC IP - User Warning Currently, NSM-MLC integration Check the MLC server
mapping/User count supports only 100,000 IP-user mappings configured with this
exceeds limit and 75,000 users. One of these limits Manager. Consider reducing
has been exceeded, so device behavior the number of users/
cannot be guaranteed until these computers that are
numbers are brought down. monitored by MLC.
Packet capture Warning The device is near capacity. Packet Check Packet Capture
complete captures might not capture all packets. configuration and restart if
required.
Policy Update Failed Warning Failed to update following policies Please edit the policy to fix
during Signature Set import. Please the issue.
edit the policy to fix the issue.
System startup in Warning System startup restored alerts from Attack Log page may not
progress; alerts the archive file. Attack Log page may not show all alerts.
being restored show all alerts.
Vulnerability Manager configuration
IPS policy backup Warning Failed to back up policy See ems logs.
failure <policy_name>.
Packet Log Archival Informational Indicates that the packet log This message is for
state has changed archival state has changed user information. No
action required.
Scheduler - Signature Informational Scheduler - Signature download This message is for
download from Manager from Manager to Sensor has failed. user information. No
to Sensor failed action required.
Sensor software image Informational A Sensor software image or This message is for
or signature set import signature set file is in the process of user information. No
in progress being imported from the Network action required.
Security Platform Update Server to
the Manager server.
Informational This message is for
user information. No
action required.
Signature set update Informational Signature set update failed while This message is for
failed transferring from the Manager user information. No
server to the Sensor. action required.
Signature set update Informational The attempt to update the You must re-import a
not successful signature set on the Manager was signature set before
not successful, and thus no performing any action
signature set is available on the on the Manager. A
Manager. valid signature set
must be present before
any action can be
taken in Network
Security Platform.
Switchback has been Informational N/A This message is for
completed, the primary user information. No
Manager has got the action required.
control of Sensors now
Scheduled Vulnerability Informational Scheduled Vulnerability Manager Refer to error logs for
Manager vulnerability vulnerability data import has failed details
data import failed
Vulnerability data Informational This message indicates that the
import from McAfee vulnerability data import from
Vulnerability Manager McAfee Vulnerability Manager
database was successful database is successful.
For more information on importing
vulnerability data reports in
Manager, see Importing
Vulnerability Scanner Reports,
McAfee Network Security Platform
Integration Guide.
MDR manual switch Informational Manager Disaster Recovery initiated This message is for
over successful; the via a manual switchover, is user information. No
Secondary <Manager/ successfully completed. Secondary action required.
Central Manager> is in Manager is now in control of
control of <Sensors/ Sensors.
Manager>
MDR automatic Informational Manager Disaster Recovery Failover has occurred;
switchover has been switchover has been completed; the the Secondary
completed; the Secondary Manager is in control of Manager is now in
Secondary <Manager/ Sensors. control of the Sensors.
Central Manager> is in Troubleshoot problems
control of <Sensors/ with the Primary
Manager> Manager and attempt
to bring it online again.
Once it is online again,
you can switch control
back to the Primary.
MDR configuration Informational Manager Disaster Recovery This message is for
information retrieval Secondary Manager has user information. No
from Primary Manager successfully retrieved configuration action required.
successful information from the Primary
Manager.
MDR forced switch over Informational Manager Disaster Recovery is This message is for
has been completed; completed via a manual switchover. user information, no
the Secondary Secondary Manager is now in action required.
<Manager/Central control of Sensors.
Manager> is in control
of <Sensors/Manager>
MDR operations have Informational Manager Disaster Recovery This message is for
been resumed functionality has been resumed. user information, no
Failover functionality is again action required.
available.
MDR operations have Informational Manager Disaster Recovery This message is for
been suspended functionality has been suspended. user information, no
No failover will take place while action required.
MDR is suspended.
MDR switchback has Informational Manager Disaster Recovery This message is for
been completed; the switchback has been completed; user information, no
Primary <Manager/ the Primary Manager has regained action required.
Central Manager> is in control of Sensors.
control of <Sensors/
Manager>
Sensor faults
The Sensor faults can be classified into critical, error, warning, and informational. The Action column
provides you with troubleshooting tips.
Failover peer status Critical This fault indicates whether the This fault clears automatically
Sensor peer is up or down. when the Sensor peer is up.
Fan error Critical One or more of the fans inside On the I-4000, you can also
the Sensor have failed. check the Sensor's front panel
For the I-4000 and 4010, the LEDs to see which fan has failed.
Manager indicates which fan has If a fan is not operational,
failed. McAfee strongly recommends
powering down the Sensor and
contacting Technical Support to
schedule a replacement unit.
In the meantime, you can use an
external fan (blowing into the
front of the Sensor) to prevent
the Sensor from overheating
until the replacement is
completed.
<Sensor_name> Critical The attempt by the Manager to The Manager cannot push the
configuration update deploy the configuration to original device configuration
failure device <Sensor_name> failed during device re-initialization.
during device re-initialization. This can also occur when a failed
The device configuration is now device is replaced with a new
out of sync with the Manager unit, and the new unit is unable
settings. The device may be to discover its configuration
down. See the system log for information.
details.
Sensor reboot Critical User-configured SSL decryption Reboot the Sensor to cause the
required for SSL settings for a particular Sensor changes to take effect.
decryption changed, requiring a Sensor
configuration change reboot.
Signature set error Critical The device has detected an error Ensure that the Sensor is online
on signature segment and in good health. The Manager
<segment_id>. The segment will make another attempt to
error cause is <unknown push the file to the Sensor. This
cause>, and the download type fault will clear when the signature
is <init/update/unknown segments are successfully
signature download type>. pushed to the Sensor.
Solid State Drive Critical The solid state drive <drive 0> is Check the respective SSD status,
<drive 0> Error <drive 1>. on failure replace the SSD.
Sensor switched to Critical The Sensor has moved from The Sensor will remain in Layer
Layer 2 mode detection mode to Layer 2 2 mode until it is rebooted.
(Passthru) mode. This indicates
that the Sensor has experienced
the specified number of errors
within the specified timeframe
and Layer 2 mode has triggered.
Sensor switched to Critical Sensor is now operating in The Sensor has experienced
Layer 2 Bypass mode Layer2 Bypass mode. Intrusion multiple errors, surpassing the
detection/prevention is not configured Layer2 mode
functioning. threshold. Check the Sensor's
status.
Software error Critical A recoverable software error has This error may require a reboot
occurred within the device. A of the Sensor, which may then
device reboot may be required. resolve the issue causing the
fault.
SSL decryption key Critical Cannot push SSL decryption keys Occurs when the Manager
download failure to device <Sensor_name>. See cannot push the SSL decryption
system log for details. keys to a Sensor. Could result
from a network connectivity
issue.
User login via Critical Sensor reports user This message is informational.
console after Sensor <user_name> login via console
initialization after Sensor initialization. This is
a FIPS 140-2 Level 3 violation.
Advanced Threat Defense connectivity
Sensor connectivity Critical Sensor is unable to communicate Message generated based on
status with Advanced with Advanced Threat Defense Sensor Connectivity with
Threat Defense (ATD) device due to . This fault Advanced Threat Defense (ATD)
device will be cleared when connection device.
is restored.
CADS connectivity
Sensor connectivity Critical Sensor is unable to communicate Message generated based on
status with CADS with CADS device due to Sensor Connectivity with CADS
device <issue>. This fault will be device.
cleared when connection is
restored.
Licensing
Device discovered Critical Device <Sensor_name> To obtain a permanent license
without license discovered without license, and now, kindly contact Technical
may not detect attacks. Support or your local reseller.
Device discovered Critical Device <Sensor_name> was
with cluster discovered with a cluster
secondary license. secondary license. This device
should not be connected to the Manager
directly.
Device license Critical Device license expired. The
expired device may not detect attacks.
Device support Critical Device support license expired.
license expired The device may not detect
attacks.
Expired device Critical Device license expired. The
license device may not detect attacks.
Expired device Critical Device support license expired.
support license The device may not detect
attacks.
Expired license for Critical The device may not detect Please contact technical support
device of type attacks. or your local reseller to obtain a
<device_type> License.
Expired support Critical The device may not detect
license for device of attacks.
type <device_type>
Device in bad Error Please check the running status of If this fault persists, we
health device <device_name>. This fault recommend that you perform a
occurs with any type of device Diagnostic Trace and submit the
software failure. (It usually occurs in trace file to Technical Support for
conjunction with a software error troubleshooting.
fault.)
Game error Error Indicates that the engine could not be This fault clears when the engine
initialized or downloaded and also if could be initialized or
the Dat file could not be downloaded. downloaded and also if the Dat
file can be downloaded.
Internal packet Error Device is dropping packets due to Reduce the amount of traffic
drop error traffic load. passing through the Sensor as
this fault indicates overload of
traffic on the Sensor.
MLC Bulk update Error Device has a limit for the MLC Bulk Check the MLC server configured
file size exceeds Update file size that it can process. As in this Manager for the number
limit this has exceeded, update to the of users, groups, and IP user
device <Sensor_name> is aborted. mappings. Make sure they do
not exceed the limits specified in
the MLC Integration
documentation.
Out-of-range Error Device <Sensor_name> has detected Contact McAfee Technical
configuration an out-of-range configuration value. Support for assistance.
Put peer DoS Error The Sensor was unable to push a See the ems.log file for details
profile failure requested profile to the Manager. on why the error is occurring.
The fault will clear when the
Sensor is able to push a valid
DoS profile.
Peer DoS profile Error Peer DoS profile retrieval request The Manager cannot obtain the
retrieval failure from device <Sensor_name> failed. requested profile from the peer
No DoS profile for peer Sensor, nor can it obtain a saved
<peer_Sensor_name> is available. valid profile. See log for details.
Sensor reports Error The Manager received a value from This fault does not clear
an out-of-range the Sensor that is invalid. The automatically; it must be cleared
configuration additional text of the message manually.
contains details. Contact McAfee Technical
Support for assistance.
Sensor reports Error NMS user privacy key decryption Please delete NMS user and add
NMS user failed for user <user_name>. again with valid credential.
privacy key
decrypt failure
Sensor reports Error NMS user authentication key Please delete NMS user and add
NMS user decryption failed for user again with valid credential.
authentication <user_name>.
key decrypt
failure
Sensor Error The Sensor configuration update Please see ems.log file to isolate
configuration failed to be pushed from the Manager reason for failure.
update failed Server to the Sensor.
SSL decryption Error The Manager detects that a particular Re-import the key (which is
key invalid SSL decryption key is no longer valid. identified within the error
The detailed reason why the fault is message). The fault will clear
occurring is shown in the fault itself when the key is determined
message. These reasons can range to be valid.
from the Sensor re-initializing itself
with a different certificate to an
inconsistency between the decryption
key residing on a primary Sensor and
its failover peer Sensor.
Trust Error Device <Sensor_name> could not be Make sure the shared secret
Establishment added to the Manager because the entered on the device CLI
Error Bad shared secret it provided does not matches the one defined within
Shared Secret match what was defined for it on the the Manager GUI. (Note: The
Manager. shared secret is case sensitive.)
Device in high Warning Device high latency mode is currently The device will
latency mode <LatencyConflict/ attempt to
LatencyConflictCleared>. (The device will automatically
attempt to automatically recover from recover from
the high latency condition.) the high latency
Device high latency mode and Layer 2 condition.
bypass mode are currently
<LatencyConflict/
LatencyConflictCleared>. (the device will
attempt to automatically recover from
the high latency condition.)
Signature set
Deprecated Warning The Manager has detected the following These
applications use of deprecated applications in firewall applications
detected in policies: <Deprecated Application must be
firewall policies <app_name> used in Policy removed from
<policy_name>/Rule#<ruleOrderNum> the firewall
Deprecated Application <app_name> policies.
used in Rule Element(of type Application
Group) <rule_name>@<policy_name>/
Rule# <ruleOrderNum>>
Successful automatic Informational A new callback detector set has This message is for
callback detectors recently been downloaded from the user information, no
deployment GTI Server to the Manager and is action required.
being deployed to the devices.
User login via console Informational Sensor reports user login via This message is
after Sensor console after Sensor initialization. informational.
initialization This is a FIPS 140-2 Level 3
violation.
Licensing
Device discovered with Informational Device <Sensor_name> was Renew the license
license discovered with a license that will before it expires.
expire on <date>.
License detected for Informational License valid until <date>. Renew the license
<Sensor_name> of type before it expires.
Device discovery
The <NTBA Appliance/ Informational The Manager is in the process of Wait for the discovery
Sensor>, discovering the device. of the device to
<device_name> The complete.
<NTBA Appliance/
Sensor>,
<device_name>
discovery in progress
Download software
Device software image Informational Device software image is in the This message is for
download in progress process of downloading from the user information. No
McAfee Update Server to the action required.
Manager server.
NTBA faults
The NTBA faults can be classified into critical, error, warning, and informational. The Action column
provides you with troubleshooting tips.
NTBA Sigset Mismatch Error There has been a mismatch between Please check for the
Error the NTBA version <tba_sw_version> status of the follow-up
and the sigset version NTBA configuration
<sigset_version>. NSM will now try to update.
automatically push the appropriate
matching sigset.
TrustedSource
NTBA <TrustedSource Error <TrustedSource Error> Please re-check the
Error> TrustedSource
configuration.
This section lists the error messages displayed in McAfee Network Security Manager (Manager).
Contents
Error messages for RADIUS servers
Error messages for LDAP server
The table lists the error messages displayed in the User Activity Audit report.
Contents
Network outage due to unresolved ARP traffic
Delay in alerts between the Sensor and Manager
Sensor-Manager Connectivity Issues
Wrong country name in IPS alerts
Wrong country name in ACL alerts
Data/Information Collection
1 Check if the attack ARP MAC Address Flip-Flop is disabled from the policy.
Go to Policy | Intrusion Prevention | Policy Types | IPS Policies. Click on Default Prevention listed in IPS Policies
name column.
2 Check the policy on all device interfaces and make sure the ARP MAC Address Flip-Flop alert is
either disabled or not included in the policy.
3 Check if ARP spoofing is enabled on the Sensor. Use the command show arp spoof status.
Explanation
When heuristic web application server protection is enabled, Manager caching is disabled and only
the selected attacks are pushed to the Sensor. If the MAC Flip-Flop attack is not among the attacks
chosen by the user, the Sensor drops the ARP packets. This happens in scenarios such as:
A firewall in failover mode that uses a virtual MAC address, where the IP address remains the
same but the MAC address changes.
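To make the flip-flop condition concrete, the following minimal Python sketch tracks the last MAC address seen for each IP address and reports every change. It is an illustration only, not the Sensor's detection logic; the function name find_mac_flip_flops and the (ip, mac) input format are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple


def find_mac_flip_flops(
    observations: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (ip, old_mac, new_mac) each time an IP appears with a new MAC.

    `observations` is a hypothetical sequence of (ip, mac) pairs, for example
    extracted from ARP replies in a packet capture.
    """
    last_mac: Dict[str, str] = {}
    changes: List[Tuple[str, str, str]] = []
    for ip, mac in observations:
        mac = mac.lower()  # compare MAC addresses case-insensitively
        previous = last_mac.get(ip)
        if previous is not None and previous != mac:
            changes.append((ip, previous, mac))
        last_mac[ip] = mac
    return changes
```

In a firewall failover, the shared IP address would show up in the result once, with the old and new MAC addresses of the active unit.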
Troubleshooting Steps
1 Disable ARP spoofing on the Sensor. Use the command arp spoof to disable ARP spoofing.
2 Disable Heuristic Web Application Server Protection on the device's individual interfaces.
If the problem still persists, contact McAfee Support for further assistance.
Data/Information Collection
1 Execute the following commands on the Sensor:
status (execute 5 times within a 10-second period)
Also execute the same commands on a similar model Sensor, which does not have the issue.
3 Collect the attack csv file for this Sensor from the Attack Log page.
4 Collect the alert archival for the last 24 hours.
6 Create/collect the network diagram that clearly indicates where the Sensor and the Manager are
located.
Troubleshooting steps
1 Check if there are any network connectivity issues or any delay in the network. If there is a delay
in the network between the Sensor and the Manager, it can lead to low alert rates.
2 Verify that the entire link between the Sensor management port and the Manager is 1G auto, and
they are using the correct CAT6 cables.
3 Check if the other Sensors connected to the same Manager are also facing this issue. If so,
it is a Manager issue.
4 Check the Sensor policy being used. If the Default Testing or Default Exclude Informational policy is
used, the Sensor processes more alerts and the alert generation rate increases. Switching to the
Default Prevention policy can sometimes resolve the delay.
6 Check whether only a specific category of alerts is delayed or all alerts are delayed. Also
check whether the system events being raised are delayed.
7 Check if the alerts are seen in the Attack Log page as the alerts are restored here from the database.
This check will confirm if the issue is on the database or cache. Check the database size and if it is
very high, purge and tune the database.
8 Check whether the time on the Sensor matches the Manager system time. If the clocks differ, the
Manager may show the wrong timestamp in the Attack Log page, which can make alerts incorrectly
appear to be delayed.
9 Check the rate of alert generated/detected by the Sensor using the following command:
getccstats:
To check the sensor failover action (1 = Enabled, 2 = Disabled) and failover status (1 = Active,
2 = Standby, 3 = Init/Not Applicable), failover peer status (1 = Up, 2 = Down, 3 =
Incompatible, 4 = Compatible, 5 = Init/Not Applicable), fail-open status (1 = Enabled, 2 =
Disabled)
To check the count of detected alerts (signature-based, scan/recon, DoS) sent to management
port and peer Manager (in case of MDR)
To check the count of alerts sent to and received from Correlation Engine, alert correlation
counts
To check the count of alerts in ring buffer, queued to be sent to the Manager
To check ACL alerts throttling configuration status (throttling interval and threshold)
The following statistics indicate many alerts still pending in ring buffer:
AlertsInRngBufPriCount = 83621
AlertsInRngBufSecCount = 83606
PutAlertInRngBufErrCount = 6499317
The alert rate can be higher than the Manager is able to handle. This introduces a delay similar
to a backoff (reaching a maximum of 30 seconds per alert), which causes the alerts to queue up in
the ring buffer. Once this condition is reached, the alert delay increases over time. To recover,
check the types of attacks, create an exception rule to filter the attack, and see if the
Manager recovers.
10 Take the packet captures at the Sensor and the Manager side to identify whether the issue is at the
Sensor/Manager side or network side.
On the Manager, use Wireshark or equivalent to take packet captures on the Manager port 8502.
Using packet captures taken simultaneously on the Sensor and the Manager, you can identify
whether the Sensor is slow to send the alert to the Manager, the Manager is slow to send the
alert acknowledgment to the Sensor, or both (which points to a network issue).
11 Check if Layer 7 Data Collection is enabled on the Sensor. There is a known issue when Layer 7
Data Collection is enabled, where the alerts in the Attack Log page are no longer received in real
time.
IntruDbg#> show l7dcap-usage
12 On the Manager database, use SQL queries to check the frequency of alerts reaching the
Manager. Log in to MySQL on the Manager server and execute the following commands:
a Get Sensor ID from database:
select sensor_id, name from iv_sensor;
b Input the time range for which the alert generation rate needs to be checked:
SELECT "2014-05-29 18:39:47", "2014-05-30 18:39:47" INTO @stdate, @enddate;
If the problem still persists, contact McAfee Support for further assistance.
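The ring buffer check described in step 9 can be scripted once the getccstats output has been saved to a text file. The sketch below is a hedged example: it assumes the counters appear as "Name = value" lines, as in the sample above, and the threshold of 1000 is a hypothetical value chosen for illustration, not a documented limit.

```python
import re
from typing import Dict

# Matches lines such as "AlertsInRngBufPriCount = 83621".
COUNTER_LINE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)\s*$")

# Counter names taken from the sample output in step 9.
RING_BUFFER_COUNTERS = (
    "AlertsInRngBufPriCount",
    "AlertsInRngBufSecCount",
    "PutAlertInRngBufErrCount",
)


def parse_counters(text: str) -> Dict[str, int]:
    """Collect every 'Name = integer' line from saved getccstats output."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def ring_buffer_backlog(counters: Dict[str, int], threshold: int = 1000) -> Dict[str, int]:
    """Return the ring buffer counters that exceed a hypothetical threshold."""
    return {
        name: counters[name]
        for name in RING_BUFFER_COUNTERS
        if counters.get(name, 0) > threshold
    }
```

Running the status and getccstats commands several times and comparing the parsed counters between runs shows whether the backlog is growing or draining.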
Scenario
Connectivity issues between the Sensor and Manager.
Trust establishment does not happen between the Sensor and Manager.
Data/Information Collection
1 Execute the following commands on the Sensor:
status
show
show sbcfg
show mgmtcfg
show doscfg
show mgmtport
getccstats
show netstat
2 Collect the Manager infocollector logs. If possible, enable detailed debugging messages by
modifying the <Manager_INSTALL_DIR>/config/log4j_ism.xml file by adding or changing the following
lines:
<category name="iv.core.DiscoveryService"> <priority value="DEBUG"/></category>
5 Create/collect a network diagram that clearly indicates where the Sensor and Manager are located.
Troubleshooting Steps
1 Check if there is any network connectivity issue such as conflicting IP address of the Sensor. This
can result in alert/pktlog channel flaps.
2 Verify that the Management Interface speed and duplex settings are configured correctly on the
Manager and Sensor and that they are hard-coded. If this fails, change one link to auto and change
the other side's duplex and speed settings until communications are established or combinations
are exhausted.
3 Ping from the Sensor to the Manager and from the Manager to the Sensor, and make sure both pings succeed.
4 Check if the other Sensors connected to the same Manager are also facing this issue.
If yes, then it is a Manager issue.
5 Check the IP address of the system on which the Manager is installed. Make sure the correct IP
address is provided in the Sensor command set manager ip.
6 Try a deinstall and establish the trust again with the Manager.
7 Check if the Manager machine has multiple NICs. If so, open the following file:
<Manager_INSTALL_DIR>/bin/tms.bat
Modify the following line to assign the relevant IP address that is also used in the Sensor
configuration:
set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPaddress=""
Then restart the Manager.
8 Check the Sensor name, which is given on the Manager while adding the Sensor using the Add New
Device wizard. Sensor name is case sensitive so make sure it exactly matches the one given on the
Manager.
9 Check that the device type is selected as IPS Sensor while adding the Sensor using Add New Device.
Selecting incorrect device type can also lead to connectivity issues.
10 Make sure that a firewall is not blocking traffic between the Manager and Sensor on the
following ports:
Manager:4167 -> Sensor:8500 (UDP)
11 If using the malware policy, check if the file save option is enabled. Make sure firewall is not
blocking ports 8509 and 8510, which are used for saving malware files.
12 Check that UDP port 8500 is open and allows the Manager to Sensor SNMP communication.
13 Use the netstat -na command to verify that ports 8501 - 8505 are listening on the Manager. Click
Start | Run, type cmd, press ENTER, then type netstat -na.
14 Make sure large UDP and/or fragmented UDP packets are not dropped between the Sensor and
Manager communication. This can lead to SNMP timeout. Look for the following logs in ems.log:
Ems log
******
15 Capture UDP traffic using Wireshark on the Manager. Check if the Manager is receiving UDP
response packets from the Sensor.
Sample capture on the Manager:
16 Check whether the time on the Sensor matches the Manager system time.
17 Check if there are any Out Of Memory related logs in the Manager. This can lead to connectivity issues
between the Sensor and Manager.
18 Check if the Manager is part of an MDR pair. If so, verify that the primary Manager IP address
configured on the Sensor matches the IP address of the active Manager. Also check whether the
Sensor is treating the standby Manager as the primary Manager. This may lead to connectivity issues.
If the problem still persists, contact McAfee Support for further assistance.
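As a complement to the netstat check in step 13, the Manager ports can also be probed from another host on the same network path. The following Python sketch assumes that ports 8501 - 8505 are TCP listeners, which is what a listening state in netstat output suggests; it cannot verify the UDP port 8500 used for SNMP. The function name check_tcp_ports is an assumption made for this example.

```python
import socket
from typing import Dict, Iterable


def check_tcp_ports(host: str, ports: Iterable[int], timeout: float = 2.0) -> Dict[int, bool]:
    """Return {port: True/False} for whether a TCP connection to host:port succeeds."""
    results: Dict[int, bool] = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            results[port] = sock.connect_ex((host, port)) == 0
    return results
```

For example, check_tcp_ports("192.0.2.10", range(8501, 8506)) probes all five ports, where 192.0.2.10 is a placeholder for the Manager IP address. A port reported as closed from a remote host but listening in netstat on the Manager points to a firewall or routing issue in between.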
Troubleshooting Steps
1 Look up the IP address on maxmind.com to find its geographic location.
If the location reported there does not match the one shown in the alert, then it is an issue
with the Manager or the geographic database in the cloud.
2 Log in to the Sensor with the admin ID. In the Sensor CLI, type the debug command and
then enter the following command:
set loglevel mgmt (all | <0-12>) <0-15>
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Request LastByte Offset = ffffffff
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Attack Pkt Search Num = 1
Here geographic ID of 0 means that the Sensor does not send any geographic information for the
corresponding source or destination IP addresses.
3 Execute step 2 and wait for the IPS alert to be raised again.
This time the Sensor prints the country code sent from Sensor for the corresponding IPS alert.
If the Sensor sends the geographic location ID as 0, then it is an issue with the geographic database
cloud when the Manager sends a geographic based query to find the geographic location matching
an IP address. Typically for an IPS alert, the Sensor does not send any geographic location ID
value.
If the problem still persists, contact McAfee Support for further assistance.
When a wrong country name is displayed for the source or destination IP address for an IPS alert,
then it is an issue with the Manager.
Data/Information Collection
Execute show acl stats in the Sensor CLI.
Troubleshooting Steps
Execute the show acl stats command in the Sensor CLI to fetch the following data from the
management process:
Number of ACL alerts sent by the datapath processor to the management processor
Number of ACL alerts sent from the management processor to the Manager or third party software
tool.
If the received count differs from the sent or sent-direct count by a large value that is still
within 10,000, then the buffer that holds the ACL alerts at the management processor is full. This
might be the cause of the issue.
[Acl Alerts]
Received : 0
Suppressed : 0
Sent : 0
Sent Direct : 0
The buffer that receives the ACL alerts from the datapath processor is full, and it is not flushed on an
event such as ACL alert suppression being disabled or enabled. In this scenario, if the ACL alert buffer
is not flushed, the country name for the old ACL alert is mixed with the new ACL alert, which
results in the wrong country name in the ACL logs.
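The comparison of counters described above can be sketched in Python. This is only an illustration, assuming the show acl stats output has been captured as text in the "Name : value" layout shown in the sample. The function names, the BUFFER_LIMIT constant, and the sample values are hypothetical and not part of the Sensor CLI. The sketch also assumes that Suppressed is reported separately and is not subtracted from Received.

```python
import re

# Hypothetical ceiling: the guide only says "a large value but within 10,000".
BUFFER_LIMIT = 10_000


def parse_acl_alert_counters(cli_output: str) -> dict:
    """Extract the 'Name : value' counters that follow the [Acl Alerts] header."""
    counters = {}
    in_section = False
    for raw_line in cli_output.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # Any bracketed header either opens or closes the section of interest.
            in_section = line.lower() == "[acl alerts]"
            continue
        match = re.fullmatch(r"([A-Za-z ]+?)\s*:\s*(\d+)", line)
        if in_section and match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


def pending_acl_alerts(counters: dict) -> int:
    """Alerts received by the management processor but not yet forwarded.

    Assumption: Sent and Sent Direct together account for all forwarded alerts.
    """
    forwarded = counters.get("Sent", 0) + counters.get("Sent Direct", 0)
    return counters.get("Received", 0) - forwarded


# Hypothetical sample values, for illustration only.
sample = """[Acl Alerts]
Received : 9500
Suppressed : 0
Sent : 100
Sent Direct : 50
"""
backlog = pending_acl_alerts(parse_acl_alert_counters(sample))
if 0 < backlog <= BUFFER_LIMIT:
    print(f"{backlog} ACL alerts are still pending; the buffer may be full")
```

What counts as a "large value" is left to the reader, because the guide does not give a lower bound.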
If the wrong country name is displayed in the ACL alert for either the source IP address or the
destination IP address, then the issue is with the Sensor. If you cannot solve the problem after
repeating the troubleshooting steps, or if the problem is not understood, contact McAfee
Support for further assistance.
This section describes the following aspects of using the InfoCollector tool.
Contents
Introduction
How to run the InfoCollector tool
Using InfoCollector tool
Introduction
InfoCollector is an information collection tool, bundled with the Manager, that allows you to easily
provide McAfee with McAfee Network Security Platform-related log information. McAfee can use this
information to investigate and diagnose issues you may be experiencing with the Manager.
InfoCollector can collect information from the following sources within McAfee Network Security
Platform:
McAfee systems engineers can use the InfoCollector tool to provide you with a definition (.def) file via
email. This file is configured by McAfee to automatically choose information that McAfee needs from
your installation of Network Security Platform. You simply open the definition file within the
InfoCollector and it will automatically select the information that McAfee needs from your installation
of the Manager.
Alternatively, a manual approach can also be used with InfoCollector, and you can select information
yourself to provide to McAfee. For example, McAfee may ask you to select checkboxes that correspond
to different sets of information available within Network Security Platform.
1 If you do not already have InfoCollector installed, download the InfoCollector.zip file from the
McAfee website and extract it to a specific location in a specific drive:
Example
Example
Task
1 After you run InfoCollector, do one of the following:
If McAfee provides you with a definition file:
a Open the File menu and click Open Definition.
b Select the definition file that McAfee sent you via email and click Select.
b Select a Duration. Select Date to specify a start and end date, or select Last X Days.
c Select the number of days from which InfoCollector should gather information.
d Click Browse and select the path and filename of the output ZIP file.
2 Click Run.
3 Provide the output ZIP file to McAfee as recommended by McAfee Technical Support. You can send
the file via email or through FTP.
The output ZIP file contains the toolconfig.txt file, which lists the information that you have chosen
to provide McAfee.
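Before sending the output ZIP file, you may want to confirm what it contains. The following Python sketch reads toolconfig.txt from the archive using the standard zipfile module. The function name is hypothetical, and the guide does not say where in the archive the file is stored, so the sketch matches on the base name only.

```python
import zipfile
from pathlib import PurePosixPath


def read_toolconfig(zip_path: str) -> str:
    """Return the text of toolconfig.txt from an InfoCollector output ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Assumption: match by base name, because the location of the
            # file inside the archive is not documented in this guide.
            if PurePosixPath(member).name.lower() == "toolconfig.txt":
                return archive.read(member).decode("utf-8", errors="replace")
    raise FileNotFoundError(f"toolconfig.txt not found in {zip_path}")
```

For example, print(read_toolconfig(path_to_output_zip)) lists the selections recorded for that run, where path_to_output_zip is the file you chose with Browse.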
This section provides information on how the Manager Watchdog works, installing the Manager
Watchdog, starting the Manager Watchdog, using the Manager Watchdog in an MDR configuration, and
tracking the Manager Watchdog activities.
Contents
Introduction
How the Manager Watchdog works
Install the Manager Watchdog
Start the Manager Watchdog
Use the Manager Watchdog with Manager in an MDR configuration
Track the Manager Watchdog activities
Introduction
The Manager Watchdog feature is designed to restart the Manager if the Manager crashes, potentially
bringing the Manager back online before MDR initiates.
The Manager Watchdog monitors the Manager process on the Manager server periodically for
availability. If Manager Watchdog detects that the Manager has gone down unexpectedly, it restarts
the service automatically. (It does not restart the Manager if the Manager has been shut down
intentionally.)
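The monitoring behavior described above can be pictured with a short Python sketch. It is not the Manager Watchdog implementation. The three callables, the polling interval, and the return strings are hypothetical stand-ins, and the default of five restart attempts is taken from the log message documented under tracking the Manager Watchdog activities.

```python
import time
from typing import Callable, Optional


def watch_manager(
    is_running: Callable[[], bool],
    stopped_on_purpose: Callable[[], bool],
    restart: Callable[[], bool],
    max_attempts: int = 5,
    poll_seconds: float = 30.0,  # hypothetical interval; the guide gives none
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the Manager and restart it only after an unexpected stop."""
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if not is_running() and not stopped_on_purpose():
            # Unexpected stop: try to restart, up to max_attempts times.
            for _attempt in range(max_attempts):
                if restart():
                    break
            else:
                # Every attempt failed, so stop monitoring.
                return "gave up"
        sleep(poll_seconds)
    return "stopped polling"
```

Passing the checks in as callables keeps the sketch independent of how the service state is actually queried on the Manager server.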
Manager Watchdog is a manual service by default; you must explicitly start it.
You can change this setting to automatic if you want the service to start automatically after a
system reboot.
If you have changed the Manager service setting from its default (Auto) to Manual (during a
troubleshooting session, for example), consider doing the same for Manager Watchdog. This
prevents Manager Watchdog from restarting the Manager automatically.
Manager Watchdog monitors only the "Network Security PlatformMgr" service; it does not monitor
services like MySQL or Apache.
Task
1 Select Start | Settings | Control Panel. Double-click Administrative Tools, and then double-click Services.
Alternatively, you can also use the Manager icon in the Windows system tray to start or stop
Manager Watchdog. Right-click on the Manager icon at the bottom-right corner of your server and
select Start Watchdog or Stop Watchdog as required.
If the Manager Watchdog brings up a primary Manager after MDR has initiated, note that the primary
Manager does not come back Active; it checks first to determine whether the secondary is Active and
if so, remains as standby.
Manager Watchdog activities are recorded in a log file in
/<Network Security Platform install directory>/ named with the filename convention
wdout_<<time stamp>>.log
----------------------------------------------------------------------------------------------------------------------------
SERVER STDOUT: The Network Security Platform Manager Service was started successfully.
SERVER STDOUT:
SERVER STDOUT:
----------------------------------------------------------------------------------------------------------------------------
If the Manager Watchdog fails after five attempts to restart Manager, the following line appears in the
log file:
SERVER STDOUT: Failed to restart Manager after five attempts. Exiting. [kl]
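To check several Watchdog log files at once for this failure line, a small Python sketch such as the following can help. The function name is hypothetical, the directory you pass in is your own install directory, and the glob pattern wdout_*.log is an assumption based on the filename convention above, because the time stamp format is not documented here.

```python
from pathlib import Path

# Substring of the failure line documented above.
FAILURE_MARKER = "Failed to restart Manager after five attempts"


def watchdog_logs_with_failure(install_dir: str) -> list:
    """Return the names of wdout_*.log files that contain the failure line."""
    hits = []
    for log_path in sorted(Path(install_dir).glob("wdout_*.log")):
        text = log_path.read_text(encoding="utf-8", errors="replace")
        if FAILURE_MARKER in text:
            hits.append(log_path.name)
    return hits
```

An empty list means that none of the matching log files in that directory recorded a failed restart sequence.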
The McAfee Knowledgebase (KB) contains a large number of useful articles designed to answer specific
questions that might not have been addressed elsewhere in the documentation set. We suggest
checking to see if a question you have is answered in a KB article.
The following list contains some of the more commonly accessed KB articles.