laurabethstone/cicd-databricks-collection

Python

cicd-databricks-helpers

[all scopes of databricks project CICD]

3p batch processor
- ActiveBatch
  - runs the deploy or other env scripts
  - ssh github repo to AB
- Delta symlink manifest
- AWS role permissions -> table owner
spark w/ databricks dbt
- run local context
- run ui all purpose
- run job w/ deploy local
- run job w/ github actions pipenv [script] "deploy.py"
- common utils IO wrapper
  - logger
  - set context
  - dbutils widgets
- runtime
- self python functions _get(spark)
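
The common utils items above (logger, dbutils widgets) might look roughly like the sketch below. This is not the repo's code: `get_logger` and `get_param` are hypothetical names, and the lookup order (widget, then environment variable, then default) is an assumption. The only Databricks call used is `dbutils.widgets.get`, and it is reached only when a `dbutils` object is passed in, so the same helper can run in a local context.

```python
import logging
import os


def get_logger(name="cicd"):
    """Return a logger with one stream handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers when a notebook cell re-runs
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def get_param(name, default=None, dbutils=None):
    """Read a parameter from a widget, else an env var, else the default.

    The fallback order is an assumption made for this sketch.
    """
    if dbutils is not None:
        try:
            return dbutils.widgets.get(name)
        except Exception:
            pass  # widget not defined; fall through to the environment
    return os.environ.get(name.upper(), default)
```

Locally, `get_param("env", "dev")` reads the `ENV` environment variable; in a notebook, passing `dbutils=dbutils` lets a widget value take precedence.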
mlops
- infrastructure as code (iac)
  - terraform
  - jinja2
    - json.j2 or yaml deploy file
    - use native jinja env variable w/ venv setup
    - use jinja variable w/ dbutils and widgets
    - use jinja template to auto populate
- build the deploy api components dynamically
- github actions
- github runners
- permission chain aws -> appId
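
As a rough illustration of the "template to auto populate" idea in the jinja2 items above, the sketch below fills a JSON deploy payload from defaults, environment variables, and explicit overrides. To stay dependency-free it uses the standard library's `string.Template` as a stand-in for jinja2, so the placeholder syntax is `${name}` rather than `{{ name }}`. `JOB_TEMPLATE`, `render_deploy`, the `NODE_TYPE` variable, and all values are hypothetical. The field names mirror the Databricks Jobs API, but the payload is deliberately incomplete.

```python
import json
import os
from string import Template

# Hypothetical job template; in practice this would live in a json.j2 file.
JOB_TEMPLATE = Template("""
{
  "name": "${job_name}-${env}",
  "new_cluster": {
    "node_type_id": "${node_type}",
    "num_workers": ${num_workers}
  }
}
""")


def render_deploy(env, overrides=None):
    """Fill the template from defaults, env vars, then explicit overrides."""
    params = {
        "job_name": "example-job",  # placeholder value
        "env": env,
        "node_type": os.environ.get("NODE_TYPE", "m5.xlarge"),  # assumed default
        "num_workers": 2,
    }
    params.update(overrides or {})
    rendered = JOB_TEMPLATE.substitute(params)
    return json.loads(rendered)  # fails fast if the rendered text is not valid JSON
```

Parsing the rendered text with `json.loads` before sending it anywhere catches a template that produces malformed JSON at build time rather than at deploy time.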
pytest
- Unit
- Ingest
- Checks
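
A minimal sketch of how a unit test and a data check could be laid out for pytest, assuming rows are plain dicts. `find_null_keys` and both test names are hypothetical, and the in-memory list stands in for a real table read.

```python
def find_null_keys(rows, key):
    """Return the rows whose key column is missing or None."""
    return [r for r in rows if r.get(key) is None]


def test_unit_find_null_keys():
    # Unit: the helper flags both an explicit None and a missing key.
    rows = [{"id": 1}, {"id": None}, {}]
    assert find_null_keys(rows, "id") == [{"id": None}, {}]


def test_check_no_null_ids():
    # Check: ingested data should have no null ids (stand-in for a table read).
    ingested = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
    assert find_null_keys(ingested, "id") == []
```

pytest collects any function whose name starts with `test_`, so the unit, ingest, and check groups can live in separate files and run selectively.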
databricks-api
tableau-api-lib
- control the analytic end point
resources
- ec2 selection
- cluster monitoring
- setting spark conf
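
For the "setting spark conf" item, one possible pattern is to keep the settings in a single dict and apply them to the session in one place. This is a sketch under assumptions: `SPARK_CONF` and `apply_spark_conf` are hypothetical names, and the values shown are illustrative rather than recommendations. `spark.conf.set(key, value)` is the standard runtime config call on a `SparkSession`.

```python
# Assumed example settings; tune per workload.
SPARK_CONF = {
    "spark.sql.shuffle.partitions": "200",
    "spark.sql.adaptive.enabled": "true",
}


def apply_spark_conf(spark, conf=None):
    """Set each key on the session's runtime config and return what was applied."""
    conf = dict(SPARK_CONF if conf is None else conf)
    for key, value in conf.items():
        spark.conf.set(key, value)
    return conf
```

In a notebook or job this would be called as `apply_spark_conf(spark)` right after the session is available.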
bi analytic golden rules
- users will want a data dictionary
- users will want visual consistency
- users will want simple, not loud
- users will want flexibility of choice
- users will want to export data by their choosing
- users will want the data to match their CMS -- use sys report as viz source
tbd