Environment Setup:
Number of nodes: 4 (1 Namenode, 3 Datanodes)
Hostnames:
Namenode – namenode
Datanodes – datanode1, datanode2, datanode3
2. Edit the “/etc/hosts” file, mapping the IP address of each cluster node to its hostname.
~$ sudo vim /etc/hosts
namenode-ip-address namenode
datanode1-ip-address datanode1
datanode2-ip-address datanode2
datanode3-ip-address datanode3
5. Download the Java 1.7 JDK tarball. Check whether the machine is 32-bit (i386,
i586, i686) or 64-bit (x86_64) before downloading.
6. Assuming the downloaded tarballs are present in the user's home directory,
extract them.
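Assuming the tarball filenames match the versions used later in this guide (jdk1.7.0_79 and hadoop-2.7.1; adjust the names to your actual downloads), extraction might look like:

```shell
cd ~
# Extract the JDK tarball (creates jdk1.7.0_79/)
tar -xzf jdk-7u79-linux-x64.tar.gz
# Extract the Hadoop tarball (creates hadoop-2.7.1/)
tar -xzf hadoop-2.7.1.tar.gz
```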
8. For these variables to take effect in the current shell, source the file.
~$ source ~/.bashrc
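The variables in question are the Java and Hadoop paths exported in ~/.bashrc; with the install locations used elsewhere in this guide, the entries would look something like this (a sketch, adjust to your own paths):

```shell
# Append to ~/.bashrc (paths assumed from this guide's setup)
export JAVA_HOME=/home/multinode/jdk1.7.0_79
export HADOOP_HOME=/home/multinode/hadoop-2.7.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```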
Check whether the changes have been applied properly
~$ echo $JAVA_HOME
~$ hadoop version
~/hadoop-2.7.1/etc/hadoop$ vi hadoop-env.sh
export JAVA_HOME=/home/multinode/jdk1.7.0_79
~/hadoop-2.7.1/etc/hadoop$ vi core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:8020</value>
  </property>
</configuration>
~/hadoop-2.7.1/etc/hadoop$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/multinode/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/multinode/data</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>namenode:50070</value>
  </property>
</configuration>
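It helps to make sure the local directories named in dfs.namenode.name.dir and dfs.datanode.data.dir exist and are writable by the Hadoop user; a minimal sketch using the paths configured above:

```shell
# On the namenode host: directory for HDFS metadata
mkdir -p /home/multinode/name
# On each datanode host: directory for HDFS block storage
mkdir -p /home/multinode/data
```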
~/hadoop-2.7.1/etc/hadoop$ vi yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>namenode</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
~/hadoop-2.7.1/etc/hadoop$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Note: These configurations assume that the Namenode, ResourceManager and
JobHistoryServer daemons all run on the “namenode” host.
~/hadoop-2.7.1/etc/hadoop$ vi slaves
datanode1
datanode2
datanode3
In the “datanodes”:
Repeat steps 1 to 9 on every datanode.
10. To enable passwordless SSH login from the namenode to all datanodes:
In “namenode”:
~$ ssh-keygen
~$ ssh-copy-id -i ~/.ssh/id_rsa.pub namenode
~$ ssh-copy-id -i ~/.ssh/id_rsa.pub datanode1
~$ ssh-copy-id -i ~/.ssh/id_rsa.pub datanode2
~$ ssh-copy-id -i ~/.ssh/id_rsa.pub datanode3
This avoids being prompted for a password when starting the daemons.
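To confirm that key-based login works, a command run over SSH from the namenode should now complete without a password prompt, for example:

```shell
# Should print "datanode1" without asking for a password
ssh datanode1 hostname
```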
This formats the dfs.namenode.name.dir location and creates the files and
folders the namenode requires.
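For reference, the formatting described above and the subsequent daemon start-up from the namenode use the standard Hadoop 2.x commands (a sketch, assuming $HADOOP_HOME/bin and $HADOOP_HOME/sbin are on the PATH as set in ~/.bashrc):

```shell
# One-time formatting of the namenode metadata directory
hdfs namenode -format
# Start HDFS daemons: NameNode here, DataNodes on the hosts listed in "slaves"
start-dfs.sh
# Start YARN daemons: ResourceManager here, NodeManagers on the datanodes
start-yarn.sh
# Start the MapReduce job history server
mr-jobhistory-daemon.sh start historyserver
```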
13. To check the running daemons on each node, use jps (the Java Virtual
Machine Process Status tool).
~$ jps
On the namenode this should list NameNode, ResourceManager and
JobHistoryServer; on each datanode, DataNode and NodeManager.