Big Data Analysis with Apache Hive

These are the exercise files for the Big Data Analysis with Apache Hive course.

The course outline is listed below.

Module 1: Get Started on Apache Hive

  • What is Hive?
  • How Hive Works
  • Setup Hive with VirtualBox and CDH

Module 2: Manipulating Data in Hive

  • Data Structures in Hive
  • Creating Tables in Hive
  • Handling CSV files in Hive
  • Partitioning Tables

Module 3: Retrieving Data from Hive

  • Retrieving data with SELECT
  • Retrieving Data from Complex Structures

Module 4: Aggregating Data with Hive

  • Simple Aggregations
  • Grouping Sets
  • Using CUBE and ROLLUP

Module 5: Filtering Results with Hive

  • Simple filter with WHERE
  • Filtering aggregates with HAVING
  • Finding similar values with LIKE

Module 6: Joining Tables

  • Combining tables with JOIN
  • Where to use SEMI JOIN
  • Joining multiple tables together

Module 7: Manipulating Data

  • Data Manipulation Functions
  • String Functions
  • Math Functions
  • Date Functions
  • Conditional Functions