Metis-Data

A set of utility functions and classes for running jobs and libraries on a Spark cluster, primarily on
the Databricks platform, and targeted at AWS deployments.

TODO: This was copied from the old jobworthy repo and needs substantial updates to match the metis_data API.

Spark Job

Spark Job
Job Configuration

Util Module

Spark Session

Repo Module

The Repo Module

Repository Module

Database and Table

Schema Module

The Schema module provides functions for building a more abstract definition of a Hive table schema, along with abstractions for creating table, column, and cell data that can be passed as the data argument when creating a dataframe.
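The idea can be illustrated with a minimal, standard-library-only sketch. It is not the metis_data API: the names `Column`, `Table`, `add_row`, and `ddl` are hypothetical and exist only to show how an abstract schema definition and row data built from named cells might fit together.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Column:
    """Hypothetical abstract column: a name, a Hive type name, and nullability."""
    name: str
    hive_type: str
    nullable: bool = True


@dataclass
class Table:
    """Hypothetical abstract table: ordered columns plus rows built from cells."""
    name: str
    columns: List[Column]
    rows: List[Tuple[Any, ...]] = field(default_factory=list)

    def add_row(self, **cells: Any) -> "Table":
        # Reject cells that do not correspond to a declared column.
        unknown = set(cells) - {c.name for c in self.columns}
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        row = []
        for col in self.columns:
            value = cells.get(col.name)  # missing cells default to None
            if value is None and not col.nullable:
                raise ValueError(f"column {col.name!r} is not nullable")
            row.append(value)
        # Store the row as a tuple in declared column order.
        self.rows.append(tuple(row))
        return self

    def ddl(self) -> str:
        # Render the column list as "name type" pairs separated by commas.
        return ", ".join(f"{c.name} {c.hive_type}" for c in self.columns)


people = Table("people", [Column("id", "bigint", nullable=False),
                          Column("name", "string")])
people.add_row(id=1, name="Ada").add_row(id=2)
print(people.ddl())
print(people.rows)
```

In this sketch, `people.rows` is a plain list of tuples in column order, which is the kind of value that could be supplied as the data argument when creating a dataframe. How the real module represents tables, columns, and cells may differ.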