Course Detail

Hadoop MapReduce

Course Added By - Kloudmagica (Admin)


Course Description

This MapReduce training course is designed to develop you into a MapReduce expert. It provides practical, hands-on training in Hadoop MapReduce. Once you complete the course, KloudMagica awards you a certification that recognizes you as a MapReduce expert and can open new career opportunities.



Training Objectives

To develop you into a MapReduce expert, the course has the following objectives:

  • Master the concepts of HDFS and the MapReduce framework, and understand the Hadoop 2.x architecture
  • Set up a Hadoop cluster and write complex MapReduce programs

How will I benefit from learning Hadoop MapReduce?

Big Data and Hadoop experts are in high demand, and the market is expected to approach 100 billion by 2022. Because demand exceeds supply, experts in this field are scarce, and salaries for Big Data and Hadoop experts are expected to rise sharply.

Who would benefit the most from this course?

This training course can open new career opportunities for a range of professionals, such as:

  • Software Developers and Architects
  • Analytics Professionals
  • Senior IT professionals
  • Testing and Mainframe professionals
  • Data Management Professionals
  • Business Intelligence Professionals
  • Project Managers
  • Aspiring Data Scientists
  • Graduates looking to build a career in Big Data Analytics


Are there any prerequisites or prior qualifications/eligibility criteria for this course?

Not really, though basic knowledge of Java would be a plus. To help with that, KloudMagica offers complimentary access to the “Java Essentials for Hadoop” course. Besides Java, knowledge of SQL and Linux can also be beneficial.




Module 1, Session 1, Part 1
      What is Data?
      Format of data

  •         Unstructured Data

  •          Structured Data

  •          Semi-structured Data

      Units of measurement
      What is Big Data?

Module 1, Session 1, Part 2
      Evolution of Big Data
      Characteristics of Big Data

  •          Volume

  •          Velocity

  •          Variety

      Characteristics of Big Data-Revision

Module 1, Session 2, Part 1
      The Additional Four V’s

  •          Veracity

  •          Variability

  •          Value

  •          Visualization

      The Additional Four V’s - Revision
      Big Data Statistics
      Challenge with Big Data
      Sources of Big Data



Module 2, Session 1, Part 1
      Traditional Approach to process the data
      Challenges in Big data
      History of Hadoop

Module 2, Session 1, Part 2
      Hadoop features
      Rack Awareness

Module 2, Session 2, Part 1
      Rack Awareness
      Where Hadoop can be used?
      Hadoop Master-Slave Architecture
      Hadoop Components

Module 2, Session 2, Part 2
      What is HDFS-DataNode
      What is HDFS-NameNode
      What is HDFS-Secondary NameNode
      Key Concepts related to HDFS
      Hadoop Cluster
      Hadoop Workflow

Module 2, Session 3
      HDFS Write Operation
      Writing to HDFS - step by step
      Writing to HDFS - replication pipeline
      Success report for HDFS pipeline write
      HDFS multi-block replication pipeline
      Revision - what we have learnt in HDFS write operation
      Re-replicating Missing Replicas
      Client read from HDFS
      DataNode read from HDFS
      Mechanics of an HDFS delete



Module 3, Session 1, Part 1
      MapReduce Key Concept
      MapReduce Job Life Cycle
      MapReduce Phase 

Module 3, Session 1, Part 2
      MapReduce Phase – Description

Module 3, Session 2, Part 1
      MapReduce Phase – Description

Module 3, Session 2, Part 2
      MapReduce Example
      Unbalanced Cluster
      Cluster Balancing

Module 3, Session 3, Part 1
      Block size in HDFS
      Estimate Hadoop storage 
      Estimate the number of data nodes

Module 3, Session 3, Part 2
      Apache Hadoop Ecosystem



Module 4, Session 1
      Hadoop History
      Hadoop 1.X
      Hadoop 1.X – Limitations

Module 4, Session 2
      YARN Components
      Container
      Resource Manager

Module 4, Session 3
      YARN Request Flow
      HDFS High Availability
      HDFS Federation

Module 4, Session 4
      Failover and Fencing    
         Task Failure
         ApplicationMaster Failure
         NodeManager Failure
         ResourceManager Failure



Module 5, Session 1, Part 1
      Scheduler Options
      Overview of Capacity and FAIR Scheduler
      Capacity Scheduler

  •          Enabling Capacity Scheduler
  •          Configuring Capacity Guarantees
  •          Enforcing Capacity Limits

Module 5, Session 1, Part 2
      Fair Scheduler

  •          Fair Scheduler Configuration
  •          Determining Dominant Resource Share in the DRF Policy




Module 6, Session 1
      Hadoop Configuration Files

  •          Hadoop Default Configuration Files
  •          Hadoop Site-specific Configuration Files


Module 6, Session 2
      hdfs-*.xml
      mapred-site.xml

Module 6, Session 3
      Minimum and maximum allocation unit in YARN
      Memory Allocations in YARN
      Configuring Hadoop Daemons
      Daemon Configuration Variables
      Precedence of Hadoop Configuration Files


Module 7, Session 1
      Input Format

  •          Text Input Format
  •          Key Value Text Input Format
  •          NLine Input Format
  •          Sequence File Input Format 

      Multiple Inputs
      Output Formats

  •          Text Output Format 
  •          Sequence File Output Format 
  •          Sequence File As Binary Output Format

      Key characteristics of the key and value classes
      Data Types

  •          Writable & Writable Comparable interfaces
  •          Primitive Writable Classes

      Array Writable Classes
      Null Writable and Text
      Object Writable and Generic Writable



Module 8, Session 1
      Ubuntu Installation

Module 8, Session 2
      Eclipse Installation

Module 8, Session 3
      Single Node Cluster Installation

Module 8, Session 4
      Multi Node Cluster Installation

Module 9, Session 1, Part 1
      Hadoop command execution

Module 9, Session 1, Part 2
      Hadoop command execution


Module 10, Session 1, Part 1
      WordCount Program explanation

Module 10, Session 1, Part 2
      WordCount Program explanation and execution

Module 10, Session 2, Part 1
      Difference between Old API and New API

Module 10, Session 2, Part 2
      Hadoop ChainMapper in Detail

MapReduce-Session-1.1.1 (What is Data + Format of data)
MapReduce-Session-1.1.2 (Evolution of Big Data + Characteristics of Big Data(3V))
MapReduce-Session-1.2.1 (The Additional four V’s(Veracity, Variability, Value and Visualization))
MapReduce-Session-2.1.1 (Traditional Approach to process the data + Challenges in Big data)
MapReduce-Session-2.1.2 (Hadoop features + Rack Awareness)
MapReduce-Session-2.2.1 (Rack Awareness + Hadoop Master-Slave Architecture)
MapReduce-Session-2.2.2 (DataNode + NameNode + Secondary NameNode + Hadoop Cluster)
MapReduce-Session-2.3.1 (HDFS-Read, Write and Delete Operation)
MapReduce-Session-3.1.1 (MapReduce + MapReduce Job Life Cycle)
MapReduce-Session-3.1.2 (MapReduce Phase – Description)
MapReduce-Session-3.2.1 (MapReduce Phase – Description)
MapReduce-Session-3.2.2 (MapReduce Example + Unbalanced Cluster + Cluster Balancing)
MapReduce-Session-3.3.1 (Block size in HDFS + Estimate Hadoop storage)
MapReduce-Session-3.3.2 (Apache Hadoop Ecosystem)
MapReduce-Session-4.1.1 (Hadoop History + Hadoop 1.X + Hadoop 1.X – Limitations)
MapReduce-Session-4.2.1 (YARN)
MapReduce-Session-4.3.1 (YARN Request Flow + High Availability + HDFS Federation)
MapReduce-Session-4.4.1 (Failover and Fencing + Failures)
MapReduce-Session-5.1.1 (Overview of Capacity and FAIR Scheduler)
MapReduce-Session-5.1.2 (Fair Scheduler)
MapReduce-Session-6.1.1 (Hadoop Configuration Files)
MapReduce-Session-6.2.1 (hdfs-*.xml + mapred-site.xml)
MapReduce-Session-6.3 (yarn-site.xml + Precedence of Hadoop Configuration Files)
MapReduce-Session-7 (Input/Output Format + Serialization)
MapReduce-Session-8.1 (Ubuntu Installation)
MapReduce-Session-8.2 (Eclipse Installation)
MapReduce-Session-9.1 (Hadoop command execution)
MapReduce-Session-9.2 (Hadoop command execution)
MapReduce-Session-10.1.1 (WordCount Program explanation)
MapReduce-Session-10.1.2 (WordCount Program explanation and execution)
MapReduce-Session-10.2.1 (Difference between Old API and New API)
MapReduce-Session-10.2.2 (Hadoop ChainMapper in detail)
