A set of utility functions and classes for running jobs and libraries on a Spark cluster, primarily on
the Databricks platform and targeted at AWS deployments.
TODO: This was copied from the old jobworthy repo and still needs substantial updates to match the metis_data API.
- Spark Job
- Job Configuration
- Spark Session
The Schema module provides functions for building a more abstract definition of a Hive table schema. It also provides abstractions for creating table, column, and cell data, which can be passed as the data argument when creating a dataframe.
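
To illustrate the idea behind the Schema module, here is a minimal sketch in plain Python. It is not the metis_data API: the `Column` and `Table` classes and their methods are hypothetical names invented for this example. The sketch assumes only that a schema can be expressed in Spark's JSON struct layout and that row data is a list of tuples ordered by column.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Column:
    """Hypothetical column definition: name, Spark type name, nullability."""
    name: str
    type_name: str
    nullable: bool = True

    def to_field(self) -> Dict[str, Any]:
        # Mirrors the layout of one field in Spark's JSON schema representation.
        return {
            "name": self.name,
            "type": self.type_name,
            "nullable": self.nullable,
            "metadata": {},
        }


@dataclass
class Table:
    """Hypothetical table abstraction holding column definitions and cell data."""
    columns: List[Column]
    rows: List[Tuple[Any, ...]] = field(default_factory=list)

    def schema(self) -> Dict[str, Any]:
        # Build a struct schema as a plain dict from the column definitions.
        return {"type": "struct", "fields": [c.to_field() for c in self.columns]}

    def add_row(self, **cells: Any) -> "Table":
        # Reject cells that do not correspond to a declared column.
        unknown = set(cells) - {c.name for c in self.columns}
        if unknown:
            raise ValueError(f"Unknown columns: {sorted(unknown)}")
        # Order cells by column position; missing cells become None.
        self.rows.append(tuple(cells.get(c.name) for c in self.columns))
        return self


table = Table(columns=[Column("id", "string", nullable=False), Column("amount", "double")])
table.add_row(id="a-1", amount=10.5).add_row(id="a-2")
```

With PySpark available, a dict in this layout can be turned into a schema with `StructType.fromJson(table.schema())`, and `table.rows` can then be passed as the data argument to `spark.createDataFrame(table.rows, schema)`.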