What does log message "entering GATHER state" mean in Red Hat
High Availability Add-on?
Updated 20/05/2014
Issue
In the event of a cluster membership change, the cluster enters into a GATHER state. The logs will report messages similar to the following:
What do these messages mean in a Red Hat High Availability cluster?
Environment
Red Hat Enterprise Linux Server (RHEL) 5 with High Availability or Resilient Storage Add-on
Red Hat Enterprise Linux Server (RHEL) 6 with High Availability or Resilient Storage Add-on
Resolution
When nodes in a cluster enter the GATHER state, they send join messages to the rest of the cluster in order to form a consensus about the cluster
membership. These messages can be interpreted as follows:
NOTE: These states are all related. The token timer is set when the token is transmitted; if it expires
before another message is received, it triggers one of these messages, depending on the state of
the protocol at the time.
7: mcast (data) message received from unknown node while in OPERATIONAL state
8: mcast (data) message received from unknown node while in GATHER state
Self-explanatory I think. This can be caused by a brief network split where
a node is forced to leave the cluster but doesn't get fenced before the network
heals again.
Root Cause
The GATHER state message is normally caused by a network or communication issue within the cluster, but the GATHER state can be entered for a number of
reasons. The number at the end of the message ("from X") indicates why the node entered the GATHER state. This state change is called by
"message_handler_memb_merge_detect" when the cluster is checking whether other nodes are out on the network.
This happens every time a node receives its own token back (meaning it is the only node in the ring). During this time, it starts a timer to form
and agree on a membership list of nodes in the cluster. If this timer expires, the node enters the GATHER state to see if there is another node out there, and
attempts to merge with it. After the node has received its own token back a certain number of times, it stops sending the token, and these
state changes stop as well. They are therefore a side effect of the earlier communication problem and subsequent fencing that left this node alone
in the cluster.
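
As a rough illustration of reading the number at the end of the message, the sketch below pulls the reason code out of a syslog line and looks it up. It covers only codes 7 and 8, the two described under Resolution; any other code is reported as not covered here. The function name, the table name and the sample log line (timestamp, host name, process ID) are assumptions made for this example and are not taken from a real cluster.

```python
import re

# Reason codes described in this article. Other codes exist but are not
# documented here, so they are deliberately left out of this table.
GATHER_REASONS = {
    7: "mcast (data) message received from unknown node while in OPERATIONAL state",
    8: "mcast (data) message received from unknown node while in GATHER state",
}

# Matches the text "entering GATHER state from X" and captures X.
GATHER_RE = re.compile(r"entering GATHER state from (\d+)")


def explain_gather(line):
    """Return (code, description) for a GATHER log line, or None if no match."""
    match = GATHER_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, GATHER_REASONS.get(code, "reason not described in this article")


# Hypothetical sample line: the timestamp, host and PID are made up.
sample = "May 20 10:15:01 node1 corosync[1234]: [TOTEM ] entering GATHER state from 8."
print(explain_gather(sample))
```

Running this over /var/log/messages line by line gives a quick count of which reason codes occur most often, which can help narrow down whether the membership changes follow a network split or a node being left alone in the ring.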
Product(s): Red Hat Enterprise Linux
Component: cluster cman openais
Category: Learn more
Tags: cluster cluster ha high availability cluster_suite high availability add-on syslog
Copyright 2014 Red Hat, Inc.