Install Delight on Databricks without downloading a jar at runtime using an init script
jystephan opened this issue · 1 comments
Is your feature request related to a problem? Please describe.
We don't want to use an init script to install Delight on Databricks.
We would rather not download the jar every time a cluster starts (for several reasons). We also need to attach Delight to dynamically created clusters, and we don't want to set global init scripts, to avoid affecting production.
I've tried using Databricks cluster library features:
- Uploaded delight_2.12.jar to S3 and attached it to the cluster as a library.
- Modified the init script to remove the wget step.
- Kept the rest the same (or set spark_conf to specify the Spark listener and the access token).
- Passed the init script location (S3) to Databricks via init_scripts = [ ] in the cluster API JSON payload.
but it doesn't seem to work, even though I have checked:
- The delight_2.12.jar is attached to the cluster as a library
- The init scripts show as completed in the cluster event log
- The Spark configuration shows that both the access token and the Spark listener are set to the correct values
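For reference, the setup described above can be sketched roughly as follows. This is only an illustration, not the exact payload we use: the listener class and token key are what I understand the Delight README to specify, the `init_scripts` and `libraries` shapes follow my reading of the Databricks Clusters and Libraries APIs, and the S3 paths, region, cluster id, and helper names are hypothetical.

```python
import json

# Assumed from the Delight README; adjust if the documented names differ.
DELIGHT_LISTENER = "co.datamechanics.delight.DelightListener"
TOKEN_KEY = "spark.delight.accessToken.secret"


def build_cluster_spec(init_script_s3, access_token, region="us-east-1"):
    """Return the Delight-related part of a cluster create payload."""
    return {
        "spark_conf": {
            "spark.extraListeners": DELIGHT_LISTENER,
            TOKEN_KEY: access_token,
        },
        # The init script (with wget removed) is referenced from S3.
        "init_scripts": [
            {"s3": {"destination": init_script_s3, "region": region}}
        ],
    }


def build_library_install(cluster_id, jar_s3):
    """Return a library install payload that attaches the jar from S3."""
    return {"cluster_id": cluster_id, "libraries": [{"jar": jar_s3}]}


def missing_delight_conf(spark_conf):
    """List the Delight settings that are absent from a spark_conf dict."""
    missing = []
    if DELIGHT_LISTENER not in spark_conf.get("spark.extraListeners", ""):
        missing.append("spark.extraListeners")
    if not spark_conf.get(TOKEN_KEY):
        missing.append(TOKEN_KEY)
    return missing


if __name__ == "__main__":
    # Hypothetical bucket, token placeholder, and cluster id.
    spec = build_cluster_spec("s3://my-bucket/delight/init.sh", "<access-token>")
    lib = build_library_install("<cluster-id>", "s3://my-bucket/jars/delight_2.12.jar")
    print(json.dumps(spec, indent=2))
    print(json.dumps(lib, indent=2))
    print("missing settings:", missing_delight_conf(spec["spark_conf"]))
```

`missing_delight_conf` mirrors the last check in the list: it only confirms that both settings are present in the configuration, not that the listener class was actually loaded by the driver.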
Describe the solution you'd like
Ability to install Delight without downloading a jar at runtime from an init script.
In the scenario above, the differences from the original script are that delight.jar is not copied to the /mnt/xxx directory but installed directly as a Databricks cluster library (via the cluster API), and that the init scripts are supplied through the Databricks Cluster API init_scripts payload rather than the UI. The Spark application ran without issue, but nothing was captured in the Delight UI.
I would also strongly suggest open-sourcing the UI. As it stands it is a black box, which makes issues like this hard to troubleshoot.