Pre-requisites: We assume that the user has the following up and running before starting R and Hadoop integration:
Ubuntu 12.04
Hadoop 1.x. If you do not have Hadoop preinstalled on your Ubuntu machine, please follow the Single-node-cluster-(pseudo-distributed-mode-cluster.pdf guide present in your LMS under Module-7 to set up the environment for R integration with Hadoop. Once the Hadoop installation is done, make sure that all the processes are running:
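A quick way to confirm that the Hadoop daemons are up is the jps utility that ships with the JDK. The process names in the comment below assume a Hadoop 1.x pseudo-distributed setup, so treat this as a sketch rather than authoritative output:

```shell
# List running JVM processes; on a healthy Hadoop 1.x single-node
# cluster you should see entries similar to:
#   NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker
jps
```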
Note: R integration with Hadoop has issues with java-openjdk. To resolve this, we need oracle-java6 installed on the machine. To install oracle-java6, follow these steps:
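The original steps are not reproduced in this copy. One common approach on Ubuntu 12.04 at the time was the third-party WebUpd8 PPA, sketched below; the PPA and package names are assumptions based on that era's tooling (the PPA has since been discontinued), so adapt them to whatever Oracle Java 6 source you actually use:

```shell
# Add the third-party PPA that provided Oracle Java installers
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
# Downloads and installs Oracle Java 6 (prompts to accept the license)
sudo apt-get install oracle-java6-installer
# Verify that the Oracle JVM is now the active Java version
java -version
```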
Installing RHadoop
RHadoop mainly consists of the following three R packages: rmr2, rhdfs, and rhbase. The rmr2 package provides Hadoop MapReduce functionality in R, rhdfs provides HDFS file operations in R, and rhbase provides HBase connectivity from R.
Download the following packages from http://cran.cnr.berkeley.edu/: bitops, rhdfs, digest, rJava, functional, RJSONIO, plyr, rmr2, Rcpp, stringr, reshape2. The installation requires the corresponding tar.gz archives to be downloaded. If the downloaded files are in Downloads, give the following command:
Rcpp Package
RJSONIO Package
digest Package
functional package
stringr package
plyr package
bitops package
reshape2 package
rmr2 package
rJava package
sudo R CMD INSTALL rJava_0.9-3.tar.gz
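The same R CMD INSTALL pattern applies to each of the packages listed above. A sketch of the full sequence, in rough dependency order, is below; the wildcarded file names are assumptions, so substitute the exact versions of the tar.gz archives you downloaded:

```shell
cd ~/Downloads
# Install the CRAN dependencies first (file names are examples;
# substitute the archives you actually downloaded)
sudo R CMD INSTALL Rcpp_*.tar.gz
sudo R CMD INSTALL RJSONIO_*.tar.gz
sudo R CMD INSTALL digest_*.tar.gz
sudo R CMD INSTALL functional_*.tar.gz
sudo R CMD INSTALL stringr_*.tar.gz
sudo R CMD INSTALL plyr_*.tar.gz
sudo R CMD INSTALL bitops_*.tar.gz
sudo R CMD INSTALL reshape2_*.tar.gz
sudo R CMD INSTALL rJava_*.tar.gz
# RHadoop packages last, since they depend on the ones above
sudo R CMD INSTALL rhdfs_*.tar.gz
sudo R CMD INSTALL rmr2_*.tar.gz
```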
Without the mapreduce function, we could write simple R code to double all the numbers from 1 to 100:
> ints = 1:100
> doubleInts = sapply(ints, function(x) 2*x)
> head(doubleInts)
[1]  2  4  6  8 10 12
With the RHadoop rmr2 package, we can use the mapreduce function to implement the same calculation; see the doubleInts.R script:
Sys.setenv(HADOOP_HOME="/home/vikas/hadoop")
Sys.setenv(HADOOP_CMD="/home/vikas/hadoop/bin/hadoop")
library(rmr2)
library(rhdfs)
ints = to.dfs(1:100)
calc = mapreduce(input = ints,
                 map = function(k, v) cbind(v, 2*v))
from.dfs(calc)
$val
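Assuming the script above is saved as doubleInts.R, it can be run non-interactively with Rscript, as sketched below; this requires R, the rmr2 and rhdfs packages, and a running Hadoop cluster. from.dfs returns a list with $key and $val components, where $val here is a two-column matrix pairing each integer with its double:

```shell
# Run the RHadoop script outside the R console
# (HADOOP_HOME and HADOOP_CMD are set inside the script itself)
Rscript doubleInts.R
```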