Course Detail

Big Data Hadoop Certification Training

Course Added By - Kloudmagica (Admin)

833 Learners
(8 Total Reviews)

Course Description

This Hadoop training course is designed to develop you into a certified Big Data expert. The training gives you practical, hands-on experience with the Hadoop ecosystem and exposes you to best practices for HDFS, MapReduce, Hive, Pig, and Sqoop. Once the course is complete, KloudMagica will award you a certificate that validates you as a Big Data expert and opens up new career opportunities.




Training Objectives

In order for KloudMagica to nurture you into a Big Data Expert, the course entails the following important objectives:

  • Master the concepts of the HDFS and MapReduce frameworks and understand the Hadoop 2.x architecture
  • Set up a Hadoop cluster and write complex MapReduce programs
  • Learn data loading techniques using Sqoop
  • Perform data analytics using Pig, Hive and YARN
  • Implement best practices for Hadoop development
How will I benefit from learning Hadoop and Big Data?

Big Data and Hadoop experts are in high demand, and the market is expected to approach $100 billion by 2022. Because demand for these skills outstrips the supply of experts, salaries for Big Data and Hadoop professionals are expected to rise sharply.

Who would benefit the most from this course?

This training course is going to open new doors to career opportunities and success for a number of professionals such as:

  • Software Developers and Architects
  • Analytics Professionals
  • Senior IT professionals
  • Testing and Mainframe professionals
  • Data Management Professionals
  • Business Intelligence Professionals
  • Project Managers
  • Aspiring Data Scientists
  • Graduates looking to build a career in Big Data Analytics


Are there any prerequisites or prior qualifications/eligibility criteria for this course?

Not really, though basic knowledge of Java essentials would be a plus. To cover this, KloudMagica offers complimentary access to the “Java Essentials for Hadoop” course. Knowledge of SQL and Linux is also beneficial.




Module 1, Session 1, Part 1
      What is Data?
      Format of data

  • Unstructured Data
  • Structured Data
  • Semi-structured Data

      Units of measurement
      What is Big Data?

Module 1, Session 1, Part 2
      Evolution of Big Data
      Characteristics of Big Data

  • Volume
  • Velocity
  • Variety

      Characteristics of Big Data - Revision

Module 1, Session 2, Part 1
      The Additional Four V’s

  • Veracity
  • Variability
  • Value
  • Visualization

      The Additional Four V’s - Revision
      Big Data Statistics
      Challenges with Big Data
      Sources of Big Data
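The "Units of measurement" topic above covers how data volumes scale from bytes up to petabytes and beyond. A small sketch (not course material) of converting a raw byte count into those units, using the binary convention (1 KB = 1024 bytes) common in Hadoop documentation:

```python
# Convert a byte count into the largest convenient unit (KB, MB, GB, TB ...).
# Binary units: each step is a factor of 1024.

UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_size(num_bytes: float) -> str:
    """Render a byte count in the largest unit that keeps the value >= 1."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024

print(human_size(1536))          # 1.5 KB
print(human_size(5 * 1024**4))   # 5.0 TB
```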



Module 2, Session 1, Part 1
      Traditional approach to processing data
      Challenges in Big Data
      History of Hadoop

Module 2, Session 1, Part 2
      Hadoop features
      Rack Awareness

Module 2, Session 2, Part 1
      Rack Awareness
      Where can Hadoop be used?
      Hadoop Master-Slave Architecture
      Hadoop Components

Module 2, Session 2, Part 2
      What is the HDFS DataNode?
      What is the HDFS NameNode?
      What is the HDFS Secondary NameNode?
      Key Concepts related to HDFS
      Hadoop Cluster
      Hadoop Workflow

Module 2, Session 3
      HDFS Write Operation
      Writing to HDFS - step by step
      Writing to HDFS - replication pipeline
      Success report for an HDFS pipeline write
      HDFS multi-block replication pipeline
      Revision - what we have learnt in the HDFS write operation
      Re-replicating Missing Replicas
      Client read from HDFS
      DataNode read from HDFS
      Mechanics of an HDFS delete
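The "Re-replicating Missing Replicas" topic above describes how the NameNode reacts when a DataNode is lost: every block whose replica count fell below the target gets new copies scheduled on the remaining nodes. A toy model of that idea (plain Python, not Hadoop code; node and block names are made up):

```python
# Toy simulation of HDFS re-replication after a DataNode failure.
# block_map maps each block ID to the set of nodes holding a replica.

REPLICATION = 3  # default HDFS replication factor

def re_replicate(block_map, live_nodes, dead_node):
    """Restore the replication factor for blocks that lost a replica."""
    for block, holders in block_map.items():
        holders.discard(dead_node)                 # replicas on the dead node are gone
        candidates = [n for n in live_nodes if n not in holders]
        while len(holders) < REPLICATION and candidates:
            holders.add(candidates.pop(0))         # copy the block to a fresh node
    return block_map

blocks = {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn2", "dn4", "dn5"}}
live = ["dn2", "dn3", "dn4", "dn5", "dn6"]
re_replicate(blocks, live, dead_node="dn1")
print(sorted(blocks["blk_1"]))   # blk_1 is back to 3 replicas on live nodes
```

Real HDFS also considers rack awareness (covered in Module 2) when picking target nodes; this sketch just takes the first available node.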



Module 3, Session 1, Part 1
      MapReduce Key Concept
      MapReduce Job Life Cycle
      MapReduce Phase 

Module 3, Session 1, Part 2
      MapReduce Phase – Description

Module 3, Session 2, Part 1
      MapReduce Phase – Description

Module 3, Session 2, Part 2
      MapReduce Example
      Unbalanced Cluster
      Cluster Balancing

Module 3, Session 3, Part 1
      Block size in HDFS
      Estimate Hadoop storage 
      Estimate the number of data nodes
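The storage and node-count estimates above boil down to simple arithmetic. A sketch using the commonly quoted rules of thumb (replication factor of 3 plus ~25% headroom for intermediate data; these factors are illustrative, not figures from the course):

```python
# Back-of-the-envelope Hadoop capacity planning.

def hadoop_storage(raw_data_tb, replication=3, intermediate_factor=0.25):
    """Raw data * replication, plus headroom for intermediate/temp data."""
    return raw_data_tb * replication * (1 + intermediate_factor)

def data_nodes_needed(total_storage_tb, disk_per_node_tb):
    """Ceiling division: each node contributes disk_per_node_tb of capacity."""
    return -(-total_storage_tb // disk_per_node_tb)

total = hadoop_storage(100)          # 100 TB of raw data -> 375.0 TB
print(total)
print(data_nodes_needed(total, 24))  # nodes needed at 24 TB of disk each
```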

Module 3, Session 3, Part 2
      Apache Hadoop Ecosystem



Module 4, Session 1
      Hadoop History
      Hadoop 1.X
      Hadoop 1.X –Limitations

Module 4, Session 2
      YARN Components
      Container
      Resource Manager

Module 4, Session 3
      YARN Request Flow
      HDFS High Availability
      HDFS Federation

Module 4, Session 4
      Failover and Fencing
         Task Failure
         ApplicationMaster Failure
         NodeManager Failure
         ResourceManager Failure



Module 5, Session 1, Part 1
      Scheduler Options
      Overview of Capacity and FAIR Scheduler
      Capacity Scheduler

  • Enabling Capacity Scheduler
  • Configuring Capacity Guarantees
  • Enforcing Capacity Limits

Module 5, Session 1, Part 2
      Fair Scheduler

  • Fair Scheduler Configuration
  • Determining the Dominant Resource Share in the DRF Policy
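The dominant resource share used by the Fair Scheduler's DRF policy is straightforward to compute: each user's share of every cluster resource is calculated, and the largest of those shares is that user's dominant share. A minimal sketch (the cluster figures are made up for illustration):

```python
# Dominant resource share, as used by the DRF (Dominant Resource Fairness)
# policy: a user's dominant resource is the one of which they consume the
# largest fraction of the cluster.

CLUSTER = {"cpu": 100, "memory_gb": 1000}   # hypothetical cluster totals

def dominant_share(usage):
    """Return (dominant_resource, share) for one user's usage dict."""
    shares = {r: usage[r] / CLUSTER[r] for r in usage}
    resource = max(shares, key=shares.get)
    return resource, shares[resource]

# User A is CPU-heavy, user B is memory-heavy; both end up with the same
# dominant share (0.2), so DRF treats them as equally served.
print(dominant_share({"cpu": 20, "memory_gb": 50}))    # ('cpu', 0.2)
print(dominant_share({"cpu": 5, "memory_gb": 200}))    # ('memory_gb', 0.2)
```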




Module 6, Session 1
      Hadoop Configuration Files

  • Hadoop Default Configuration Files
  • Hadoop Site-specific Configuration Files


Module 6, Session 2
      hdfs-*.xml
      mapred-site.xml
Module 6, Session 3
      Minimum and maximum allocation unit in YARN
      Memory Allocations in YARN
      Configuring Hadoop Daemons
      Daemon Configuration Variables
      Precedence of Hadoop Configuration Files
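A concrete view of the "Minimum and maximum allocation unit in YARN" topic above: the scheduler rounds each container memory request up to a multiple of `yarn.scheduler.minimum-allocation-mb` and caps it at `yarn.scheduler.maximum-allocation-mb`. A sketch of that normalization, using the common default values:

```python
# How YARN normalizes a container memory request: round the request up to
# the allocation increment, then clamp it to the configured maximum.

MIN_ALLOC_MB = 1024   # yarn.scheduler.minimum-allocation-mb (default)
MAX_ALLOC_MB = 8192   # yarn.scheduler.maximum-allocation-mb (default)

def normalize_request(requested_mb):
    """Round up to a multiple of the minimum allocation, then cap."""
    increments = -(-requested_mb // MIN_ALLOC_MB)   # ceiling division
    return min(increments * MIN_ALLOC_MB, MAX_ALLOC_MB)

print(normalize_request(1500))   # 2048 (rounded up to 2 increments)
print(normalize_request(200))    # 1024 (minimum allocation)
print(normalize_request(9000))   # 8192 (capped at the maximum)
```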


Module 7, Session 1

      Input Format

  • Text Input Format
  • Key Value Text Input Format
  • NLine Input Format
  • Sequence File Input Format

      Multiple Inputs
      Output Formats

  • Text Output Format
  • Sequence File Output Format
  • Sequence File As Binary Output Format

      Key characteristics of the key and value classes
      Data Types

  • Writable and WritableComparable interfaces
  • Primitive Writable Classes

      Array Writable Classes
      Null Writable and Text
      Object Writable and Generic Writable



Module 8, Session 1
      Ubuntu Installation

Module 8, Session 2
      Eclipse Installation

Module 8, Session 3
      Single Node Cluster Installation

Module 8, Session 4
      Multi Node Cluster Installation

Module 9, Session 1, Part 1
      Hadoop command execution

Module 9, Session 1, Part 2
      Hadoop command execution


Module 10, Session 1, Part 1
      WordCount Program explanation

Module 10, Session 1, Part 2
      WordCount Program explanation and execution

Module 10, Session 2, Part 1
      Difference between the old API and the new API

Module 10, Session 2, Part 2
      Hadoop ChainMapper in Detail
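The course walks through WordCount in Java; the same map/shuffle/reduce logic can be sketched in plain Python and run without a cluster (the sample lines below are made up):

```python
# WordCount as map -> shuffle/sort -> reduce, in miniature.

from itertools import groupby

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1

def reducer(pairs):
    """Reduce phase: after the shuffle/sort, sum the counts per word."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

lines = ["Hadoop is fast", "hadoop is scalable"]
pairs = [kv for line in lines for kv in mapper(line)]
print(dict(reducer(pairs)))   # {'fast': 1, 'hadoop': 2, 'is': 2, 'scalable': 1}
```

In a real job, Hadoop performs the shuffle/sort between phases and runs many mappers and reducers in parallel; `sorted(pairs)` stands in for that step here.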



Pig, Session 1, Part 1

      PIG - Introduction
      What is PIG?
      Advantages of PIG

Pig, Session 1, Part 2

      Modes of user Interaction with PIG
      Basic parts in Pig programming language
      Pig Compilation and Execution Stages

Pig, Session 2, Part 1

      Data Types

  • Simple
  • Complex

      Case Sensitivity
      Rules for Identifiers

Pig, Session 3, Part 1

      SQL to Pig
      Relational Operators

Pig, Session 3, Part 2

      SQL to Pig
      Relational Operators

Pig, Session 3, Part 3

      SQL to Pig
      Relational Operators

Pig, Session 4, Part 1

      Relational Operators

Pig, Session 4, Part 2

      Relational Operators

Pig, Session 4, Part 3

      Relational Operators

Pig, Session 4, Part 4

      Relational Operators

Pig, Session 5, Part 1

      Relational Operators
      What is Pig UDF?
      Piggy Bank

Pig, Session 5, Part 2

      Basic Operators
      Functions
      Relational Operators - Revision



Sqoop, Session 1

      Data Load into Hadoop and Major Issues
      What is Sqoop?
      What are Sqoop Connectors?

Sqoop, Session 2

      What are Sqoop Connectors?
      Commands and Syntax for Sqoop

Sqoop, Session 3

      Commands and Syntax for Sqoop

Sqoop, Session 4

      Commands and Syntax for Sqoop

Sqoop, Session 5

      Commands and Syntax for Sqoop

Sqoop, Session 6

      Commands and Syntax for Sqoop

Sqoop, Session 7

      Commands and Syntax for Sqoop



Hive, Session 1

      What is HIVE?
      HIVE query language capabilities
      Where is HIVE not useful?
      Difference between HIVE and RDBMS
      Difference between HIVE and Pig

Hive, Session 2

      HIVE architecture

Hive, Session 3

      Components in HIVE architecture
      HIVE architecture- query flow
      Data types

Hive, Session 4

      Joins
      HIVE reads and writes records
      Compression formats and codecs
      Encoding methods

Hive, Session 5

      Partitioning

Hive, Session 6

      HIVE bucketed tables
      View


What are the payment options?

Payments are accepted through Credit Card, Debit Card and Net Banking through the acceptable payment gateways on our website.

Is Java a pre-requisite to learn Big Data and Hadoop?

No, there are no prerequisites for learning Hadoop, though prior knowledge of core Java and SQL may come in handy.

Can I Install Hadoop on Mac?

Yes, Hadoop runs on macOS. Detailed instructions are provided in the manual.

Can I Install Hadoop on Windows?

Absolutely. Hadoop is compatible with all recent versions of Microsoft Windows. You’ll need to install Oracle VirtualBox on your system and then import the KloudMagica virtual machine, which will be provided to you.

What are the recommended system requirements to install Hadoop?

Generally, any machine with 4 GB of RAM and a processor better than a Core 2 Duo will be able to handle the Hadoop environment.

Is there a minimum bandwidth to access self-paced courses?

No, there is no minimum bandwidth as such, but a speed of 1 Mbps or higher is recommended for clear video and audio and uninterrupted service.

Certification Process

Once the course and the certification quiz have been successfully completed, KloudMagica will award a certificate of completion to the student.

MapReduce-Session-1.1.1 (What is Data + Format of data)
MapReduce-Session-1.1.2 (Evolution of Big Data + Characteristics of Big Data(3V))
MapReduce-Session-1.2.1 (The Additional four V’s(Veracity, Variability, Value and Visualization))
MapReduce-Session-2.1.1 (Traditional Approach to process the data + Challenges in Big data)
MapReduce-Session-2.1.2 (Hadoop features + Rack Awareness)
MapReduce-Session-2.2.1 (Rack Awareness + Hadoop Master-Slave Architecture)
MapReduce-Session-2.2.2 (DataNode + NameNode + Secondary NameNode + Hadoop Cluster)
MapReduce-Session-2.3.1 (HDFS- Write Operation + Client read- from HDFS + HDFS delete)
MapReduce-Session-3.1.1 (MapReduce + MapReduce Job Life Cycle)
MapReduce-Session-3.1.2 (Mapreduce Phase – Description)
MapReduce-Session-3.2.1 (Mapreduce Phase – Description(continued))
MapReduce-Session-3.2.2 (Mapreduce Example + Unbalanced Cluster + Cluster Balancing)
MapReduce-Session-3.3.1 ( Block size in HDFS + Estimate Hadoop storage)
MapReduce-Session-3.3.2 (Apache Hadoop Ecosystem)
MapReduce-Session-4.1.1 (Hadoop History + Hadoop 1.X + Hadoop 1.X –Limitations)
MapReduce-Session-4.2.1 (YARN)
MapReduce-Session-4.3.1 (YARN Request Flow + High Availability + HDFS Federation)
MapReduce-Session-4.4 (Failover and Fencing + Failures)
MapReduce-Session-5.1.1 (Overview of Capacity and FAIR Scheduler)
MapReduce-Session-5.1.2 (Fair Scheduler)
MapReduce-Session-6.1.1 (Hadoop Configuration Files)
MapReduce-Session-6.2.1 (hdfs-*.xml + mapred-site.xml)
MapReduce-Session-6.3 (yarn-site.xml + Precedence of Hadoop Configuration Files)
MapReduce-Session-7 (Input/Output Format + Serialization)
MapReduce-Session-8.1 (Ubuntu Installation)
MapReduce-Session-8.2 (Eclipse Installation)
MapReduce-Session-9.1 (Hadoop command execution)
MapReduce-Session-9.2 (Hadoop command execution)
MapReduce-Session-10.1.1 (WordCount Program explanation)
MapReduce-Session-10.1.2 (WordCount Program explanation and execution)
MapReduce-Session-10.2.1 (Difference between Old Api and new Api)
MapReduce-Session-10.2.2 (Hadoop ChainMapper in detail)
Hive-Session-1.1 (What is HIVE + HIVE query language capabilities)
Hive-Session-1.2 (difference between HIVE and rdbms)
Hive-Session-1.3 (difference between HIVE and pig)
Hive-Session-2.1 (HIVE architecture)
Hive-Session-3.1 (Components in HIVE architecture + HIVE architecture- query flow)
Hive-Session-3.2 (Data types)
Hive-Session-4.1 (Joins)
Hive-Session-4.2 (HIVE reads and writes records + Compression formats and codecs)
Hive-Session-4.3 (Encoding methods)
Hive-Session-5.1 (Partitioning)
Hive-Session-6.1 (HIVE bucketed tables + View)
Pig-Session-1.1 (PIG- Introduction + What is PIG +Advantages of PIG)
Pig-Session-3.1 (SQL to Pig + Relational Operators)
Pig-Session-3.2 (SQL to Pig + Relational Operators)
Pig-Session-3.3 (SQL to Pig + Relational Operators)
Pig-Session-4.1 (Relational Operators)
Pig-Session-4.2 (Relational Operators)
Pig-Session-4.3 (Relational Operators)
Pig-Session-4.4 (Relational Operators)
Pig-Session-5.1 (Relational Operators +What is Pig UDF + Piggy Bank)
Pig-Session-5.2 (Basic Operators + Functions + Relational Operators-Revision)
Sqoop-Session-1 (Data Load into Hadoop and Major Issues + What are Sqoop Connectors)
Sqoop-Session-2 (What are Sqoop Connectors + Commands and Syntax for Sqoop)
Sqoop-Session-3 (Commands and Syntax for Sqoop)
Sqoop-Session-4 (Commands and Syntax for Sqoop)
Sqoop-Session-5 (Commands and Syntax for Sqoop)
Sqoop-Session-6 (Commands and Syntax for Sqoop)
Sqoop-Session-7 (Commands and Syntax for Sqoop)
Pig-Session-1.2 (Modes of user Interaction with PIG + Pig Compilation and Execution Stages)
Pig-Session-2 (Data Types)

Course Reviews (8 Total Reviews)

  • Sneha Agarwal

    Excellent course, explained with very simple examples. Completed the course in 4 days. Many examples with details about the problem and the solution for each. I hope I am ready for the interview. I am happy and confident now.
  • risha khullar

    The video is so informative and simplified. No words, everything is awesome... Superb video. Cleared each and every concept. The only complaint is the low voice, but with a good laptop and earphones that is not an issue. At this price it's a great deal, and the quality is very high relative to the price. I did not find much difference between the KloudMagica course and others that charge more than 2000 USD.
  • sinha khushi

    The video is very well designed and easy to understand. The Hadoop online training videos by KloudMagica are very energizing.
  • Piyush Das

    Hadoop online training with KloudMagica is really very nice. They give you complete information on Hadoop in a different manner, which makes you understand things easily.
  • Sumit Mukarjee

    I found the tutorial very useful, with step-by-step explanations. The course material manages to cover a large spectrum of aspects, both architectural and operational, in dense, short lessons. Very succinct, to the point and perfectly scripted education! Thanks to KloudMagica.
  • Taranjeet Kaur

    The course content seems to be pretty exhaustive and excellent. The material was quite great and resourceful. Hadoop proved to be quite an easy task after all. Content is well segregated into different lessons and allowed me to progress smoothly.
  • Shailesh Bhardwaj

    The material was quite great and resourceful. Content is well segregated into different lessons and allowed me to progress smoothly.
  • Richa Das

    Very well explained in a story format unlike many others who just beat around the bush.


Recorded Sessions for Self-Paced Learning

35+ hours of self-paced videos

Practical Exercises

To let you gauge your learning, each class is followed by a practical exercise to be completed before the next class.

One Time Payment - Life-time Access

You only pay once for a course, and you get unrestricted lifetime access to the course material, including all videos.


On successful completion of the course, including the certification projects and quiz, KloudMagica will award you a certification that validates you as a Big Data and Hadoop expert.