A HIVEQL TASK


Getting Started:


To run the queries on the data provided, the Databricks Community Edition notebook environment is recommended.

Installation Instructions:


If you are new to Databricks, follow these steps:

  • Step 1: Hive is a tool that provides SQL querying of data stored in HDFS/HBase. In addition to Python notebooks, Databricks offers SQL notebooks. The SQL dialect is designed to be compatible with Apache Hive, i.e. queries can be developed in SQL notebooks on Databricks and then run in Hive. From a Python notebook, the %sql magic command should be used to start querying.

  • Step 2: Import the datasets into Databricks and create a new notebook.

  • Step 3: Use the %sql magic command, then start writing queries to solve the tasks (see the example after this list).
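
A minimal sketch of what a query cell might look like, assuming a Python notebook (where the %sql magic is needed) and a hypothetical table called sample_table created from one of the imported CSV files:

    %sql
    -- sample_table is a hypothetical name; replace it with a table
    -- built from one of the imported datasets.
    SELECT *
    FROM sample_table
    LIMIT 10;

In a SQL notebook the %sql prefix can be omitted, since every cell is already interpreted as SQL.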

Importing datasets into a Databricks notebook:

  • Clone the repo and save it to your desired location on your machine.
  • Launch Databricks and create a cluster.
  • Import the CSV files from the repo into Databricks using the file upload/browse option (a sketch of how an uploaded file can be registered as a table is shown below).
  • Copy the commands from the text file into the notebook to see the solution to the task.
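
As a rough sketch, an uploaded CSV can be registered as a table and then queried with HiveQL-style SQL. The table name employees and the /FileStore/tables/ path below are assumptions for illustration; adjust them to match your actual upload location and dataset:

    -- 'employees' and the path are hypothetical; point them at your uploaded file.
    CREATE TABLE IF NOT EXISTS employees
    USING CSV
    OPTIONS (
      path '/FileStore/tables/employees.csv',
      header 'true',
      inferSchema 'true'
    );

    -- Quick check that the data loaded as expected.
    SELECT * FROM employees LIMIT 10;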