The NameNode in the Hadoop architecture is a single point of failure. How do people who have large Hadoop clusters cope with this
problem? Is there an industry-accepted solution that has worked well, wherein a secondary NameNode takes over in case the primary one
fails?
hadoop mapreduce
3 Answers
Yahoo has certain recommendations for configuration settings at different cluster sizes to take
NameNode failure into account. For example:
The single point of failure in a Hadoop cluster is the NameNode. While the loss of any
other machine (intermittently or permanently) does not result in data loss, NameNode loss
results in cluster unavailability. The permanent loss of NameNode data would render the
cluster's HDFS inoperable.
Therefore, another step should be taken in this configuration to back up the NameNode
metadata.
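One common way to act on that advice (in Hadoop 1.x terms, matching the era of this question) is to point `dfs.name.dir` at several directories, one of them on a remote NFS mount, so the loss of a single disk does not destroy the metadata. The paths below are purely illustrative, not a recommendation for any particular layout:

```xml
<!-- hdfs-site.xml: the NameNode writes its metadata (fsimage and edits)
     to every directory in this comma-separated list. Putting one copy on
     an NFS mount keeps the metadata safe if the local disks are lost.
     The paths shown are hypothetical. -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/1/dfs/nn,/data/2/dfs/nn,/mnt/nfs/dfs/nn</value>
</property>
```

This protects the metadata, not availability: the cluster is still down until a NameNode is restarted against one of the surviving copies.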
Facebook uses a tweaked version of Hadoop for its data warehouses; it has some
optimizations that focus on NameNode reliability. In addition to the patches available on
GitHub, Facebook appears to use AvatarNode specifically for quickly switching between
primary and secondary NameNodes. Dhruba Borthakur's blog contains several other entries
offering further insights into the NameNode as a single point of failure.
1 of 2 1/9/2016 2:11 PM
mapreduce - hadoop namenode single point of failure - Stack Overflow http://stackoverflow.com/questions/4502275/hadoop-namenode-single-poi...
Large Hadoop clusters have thousands of data nodes and one name node. The probability of
failure goes up linearly with machine count (all else being equal). So if Hadoop didn't cope with
data node failures it wouldn't scale. Since there's still only one name node, the Single Point of
Failure (SPOF) remains, but the probability of that one machine failing is still low.
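The scaling argument above can be checked with a little arithmetic: if each of n independent machines fails in some window with probability p, the chance that at least one of them fails is 1 - (1 - p)^n, which is roughly n·p when p is small. A minimal sketch (the per-node probability 0.001 is purely illustrative):

```python
def prob_any_failure(p, n):
    """Probability that at least one of n independent nodes fails,
    given each fails with probability p in the same window."""
    return 1 - (1 - p) ** n

# Hypothetical per-node failure probability for some time window.
p = 0.001

# One name node: the cluster's SPOF risk stays at p.
print(prob_any_failure(p, 1))

# Thousands of data nodes: some data node almost certainly fails,
# which is why Hadoop must tolerate data node loss to scale at all.
print(prob_any_failure(p, 4000))
```

This is why the single NameNode is tolerable in practice: its failure probability stays at the one-machine rate, while the data node tier must survive near-constant failures.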
That said, Bkkbrad's answer about Facebook adding failover capability to the name node is
right on.