Prophecy is designed to enable all users to be productive with data engineering. It also replaces legacy ETL products. To learn more about Prophecy visit https://docs.prophecy.io/.
Prophecy-build-tool (PBT) allows you to quickly build projects generated by Prophecy (your standard Spark Scala and PySpark pipelines) to integrate them with your own CI / CD (e.g. Github Actions), build system (e.g. Jenkins), and orchestration (e.g. Databricks Workflows).
For the latest information on how to use Prophecy-build-tool please visit https://docs.prophecy.io/metadata/prophecy-build-tool.
To install PBT, simply run:
pip3 install prophecy-build-tool
To build and deploy your Prophecy project containing Python/Scala projects and Databricks Jobs run
pbt deploy --path /path/to/your/prophecy_project/
Sample output:
Prophecy-build-tool v1.0.0
Found 1 pipelines: customers_orders (python)
Found 1 jobs: daily_report
Building 1 pipeline 🚰
Building pipeline pipelines/customers_orders [1/1]
running build
running build_py
✅ Build complete!
Deploying 1 job ⏱
Deploying job jobs/daily_report [1/1]
Uploading cs-1.0-py3-none-any.whl to dbfs:/FileStore/prophecy/artifacts/...
Updating an existing job: daily_report
✅ Deployment complete!
To just build your Prophecy project containing Python/Scala projects and Databricks Jobs run
pbt build --path /path/to/your/prophecy_project/
To build and Deploy all jobs in your project, run deploy command
pbt deploy --path /path/to/your/prophecy_project/
It's also possible to only deploy jobs associated with given fabrics, you can provide fabrics id's (comma separated) to just deploy only those jobs.
pbt deploy --fabric-ids 3,7 --path /path/to/your/prophecy_project/
By default, deploy
command builds all pipelines and then deploys them, if you want to skip building all pipelines
( this could be useful, if you are running a deploy
command after running deploy
or build
previously.)
pbt deploy --skip-builds --path /path/to/your/prophecy_project/
To run all unit tests in your Prophecy project containing Python/Scala projects and Databricks Jobs run
pbt test --path /path/to/your/prophecy_project/
Sample output:
Prophecy-build-tool v1.0.1
Found 1 jobs: daily
Found 1 pipelines: customers_orders (python)
Unit Testing pipeline pipelines/customers_orders [1/1]
============================= test session starts ==============================
platform darwin -- Python 3.8.9, pytest-7.1.2, pluggy-1.0.0 -- /Library/Developer/CommandLineTools/usr/bin/python
cachedir: .pytest_cache
metadata: None
rootdir: /Users/kiran/proj2/pipelines/customers_orders/code
plugins: html-3.1.1, metadata-2.0.2
collecting ... collected 1 item
test/TestSuite.py::CleanupTest::test_unit_test_0 PASSED [100%]
============================== 1 passed in 17.42s ==============================
✅ Unit test for pipeline: pipelines/customers_orders succeeded.
To quickly validate if all your pipelines are not broken
pbt validate --path /path/to/your/prophecy_project/
Sample output:
Prophecy-build-tool v1.0.3.4
Project name: HelloWorld
Found 1 jobs: default_schedule
Found 4 pipelines: customers_orders (python), report_top_customers (python), join_agg_sort (python), farmers-markets-irs (python)
Validating 4 pipelines
Validating pipeline pipelines/customers_orders [1/4]
Pipeline is validated: customers_orders
Validating pipeline pipelines/report_top_customers [2/4]
Pipeline is validated: report_top_customers
Validating pipeline pipelines/join_agg_sort [3/4]
Pipeline is validated: join_agg_sort
Validating pipeline pipelines/farmers-markets-irs [4/4]
Pipeline is validated: farmers-markets-irs
Prophecy-built-tool (PBT) can be integrated with CI/CD (eg: Github Actions). A sample Github Actions .yml file that builds, tests and deploys the project on every code push is mentioned below:
name: Example CI
on: [push]
env:
DATABRICKS_HOST: "https://<<databricks-host>>.databricks.com"
DATABRICKS_TOKEN: "<<databricks-token>>"
FABRIC_NAME: "fabric-name"
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up JDK 11
uses: actions/setup-java@v3
with:
java-version: '11'
distribution: 'adopt'
- name: Set up Python 3.x
uses: actions/setup-python@v4
with:
python-version: '3.x'
# Install all python dependencies
# prophecy-libs not included here because prophecy-build-tool
# takes care of it by reading each pipeline's setup.py
- name: Install dependencies
run: |
python3 -m pip install --upgrade pip
pip3 install build pytest wheel pytest-html pyspark prophecy-build-tool
- name: Run PBT build
run: pbt build --path .
- name: Run PBT test
run: pbt test --path .
- name: Run PBT deploy
run: pbt deploy --path .