Exercise files for Apache Hadoop Big Data Training

Apache Hadoop Big Data Training

These are the exercise files used for the Apache Hadoop Big Data Training course.

The course outline can be found at:

https://www.tertiarycourses.com.sg/apache-hadoop-big-data-training.html

https://www.tertiarycourses.com.my/apache-hadoop-big-data-training-malaysia.html

Day 1

Module 1: Get Started on Apache Hadoop

  • Why Hadoop?
  • Difference between HBase and Hadoop

Module 2: Hadoop Core Components

  • Java Virtual Machine (JVM)
  • HDFS
  • Hadoop Cluster Components
  • Exploring Hadoop Platforms

Module 3: Setup Hadoop Development Environment

  • Setup Cloudera Hadoop VM
  • Adding Hadoop Libraries
  • Programming Languages

Module 4: MapReduce 2.0/YARN

  • What is MapReduce?
  • MapReduce Components
  • MapReduce on HDFS

Module 5: Hive

  • What is Hive?
  • Hive Queries
  • Analyzing data with Hive

Day 2

Module 6: Pig

  • What is Pig?
  • Pig Data Types
  • Pig Commands

Module 7: Connectors and Workflows

  • Introducing Sqoop
  • Importing Data with Sqoop
  • Introducing Flume
  • Importing Data with Flume
  • Introducing ZooKeeper
  • Using ZooKeeper to coordinate workflows
  • Introducing Oozie
  • Scheduling jobs using Oozie

Module 8: Exploring Other Hadoop Libraries

  • Introducing Impala
  • Introducing Mahout
  • Introducing Storm

Module 9: Apache Spark Basics

  • Why Apache Spark?
  • Apache Spark Components
  • Apache Spark Commands