Phone: +91 7680813158

Email: contact@a1trainings.com

Hadoop Course Content


During this course, you will learn:
Introduction to Big Data and Hadoop
Hadoop ecosystem - Concepts
Hadoop MapReduce concepts and features
Developing MapReduce applications
Pig concepts
Hive concepts
Oozie workflow concepts
Flume Concepts
Hue Concepts
HBase Concepts
Real Life Use Cases

VirtualBox/VMware

Basics
Installations
Backups
Snapshots

Linux
Basics
Installations
Commands

Hadoop
Why Hadoop?
Scaling
Distributed Framework
Hadoop vs. RDBMS
Brief history of Hadoop

Set up Hadoop
Pseudo mode
Cluster mode
IPv6
SSH

Installation of Java and Hadoop


Hadoop configuration
Hadoop Processes (NN, SNN, JT, DN, TT)
Temporary directory
UI
Common errors when running a Hadoop cluster, and their solutions

HDFS - Hadoop Distributed File System


HDFS Design and Architecture
HDFS Concepts
Interacting with HDFS using the command line
Interacting with HDFS using Java APIs
Dataflow
Blocks
Replica

Hadoop Processes
Name node
Secondary name node
Job tracker
Task tracker
Data node

Map Reduce
Developing Map Reduce Application
Phases in Map Reduce Framework
Map Reduce Input and Output Formats
Advanced Concepts
Sample Applications
Combiner

Joining datasets in MapReduce jobs


Map-side join
Reduce-side join

MapReduce customization


Custom Input format class
Hash Partitioner
Custom Partitioner
Sorting techniques
Custom Output format class

Hadoop Programming Languages: I) HIVE


Introduction

Installation and Configuration


Interacting with HDFS using HIVE
MapReduce Programs through HIVE
HIVE Commands
Loading, Filtering, Grouping
Data types, Operators
Joins, Groups
Sample programs in HIVE

II) PIG
Basics
Installation and Configurations
Commands

OVERVIEW HADOOP DEVELOPER

Introduction
The Motivation for Hadoop
Problems with traditional large-scale systems
Requirements for a new approach

Hadoop: Basic Concepts


Introduction
An Overview of Hadoop
The Hadoop Distributed File System
Hands-On Exercise
How MapReduce Works
Hands-On Exercise
Anatomy of a Hadoop Cluster
Other Hadoop Ecosystem Components

Writing a MapReduce Program


The MapReduce Flow
Examining a Sample MapReduce Program
Basic MapReduce API Concepts
The Driver Code
The Mapper
The Reducer
Hadoop's Streaming API
Using Eclipse for Rapid Development
Hands-on exercise
The New MapReduce API

Common MapReduce Algorithms


Sorting and Searching
Indexing
Machine Learning With Mahout
Term Frequency Inverse Document Frequency
Word Co-Occurrence
Hands-On Exercise.

PIG Concepts.
Data loading in PIG.
Data Extraction in PIG.
Data Transformation in PIG.
Hands on exercise on PIG.

Hive Concepts.
Hive Query Language.
Alter and Delete in Hive.
Partition in Hive.
Indexing.
Joins in Hive.
Unions in Hive.
Industry-specific configuration of Hive parameters.

Authentication & Authorization.


Statistics with Hive.
Archiving in Hive.
Hands-on exercise

Working with Sqoop


Introduction.
Import Data.
Export Data.
Sqoop Syntax.
Databases connection.
Hands-on exercise

Working with Flume (02 Hours)

Introduction.
Configuration and Setup.
Flume Sink with example.
Channel.
Flume Source with example.
Complex Flume architecture.

OOZIE Concepts

IMPALA Concepts
HUE Concepts

Reporting Tool:
Tableau Software
1. Tableau Fundamentals.
2. Tableau Analytics.
3. Visual Analytics.
4. Hands-on exercise