databricks/databricks-vscode

Question


Is it necessary to upload the entire virtual environment to Databricks in order to run a file in a cluster's notebook? I haven't found any related documentation, and I'm not sure whether the virtual environment must be synced with the Databricks workspace.

Hi @santiagortiiz. If you are using notebooks within VS Code with a local kernel, then you do not need to sync anything.

If you are using the "Run file as Workflow" functionality, you will need the local environment replicated on the cluster. You can do this either by installing the libraries on the cluster directly or by adding the relevant pip install cells to your notebook. I believe a %pip install -r requirements.txt should work.
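As a rough illustration of the second option, the sketch below writes the packages installed in the local virtual environment to a requirements.txt that a notebook cell could then install from. It uses only the Python standard library and is roughly equivalent to running pip freeze. The function names `freeze_requirements` and `write_requirements` are hypothetical helpers, not part of the extension, and the output path is an assumption.

```python
from importlib import metadata
from pathlib import Path


def freeze_requirements() -> list[str]:
    """Return pinned 'name==version' lines for the active environment."""
    pinned = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if not name or not version:
            # Skip distributions with incomplete metadata.
            continue
        # Key on the lowercased name so duplicates collapse to one entry.
        pinned[name.lower()] = f"{name}=={version}"
    return sorted(pinned.values(), key=str.lower)


def write_requirements(path: str = "requirements.txt") -> Path:
    """Write the pinned requirements to `path` (hypothetical location)."""
    target = Path(path)
    target.write_text("\n".join(freeze_requirements()) + "\n")
    return target


# Run locally inside the virtual environment, for example:
#   write_requirements("requirements.txt")
```

With that file synced alongside the notebook, the first notebook cell would contain the %pip install -r requirements.txt line mentioned above. Only the small requirements file is synced this way, not the virtual environment itself. A full freeze may pin packages that are local-only or that conflict with what the cluster runtime already provides, so trimming the list to the libraries the code actually imports is probably safer.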

In any case, we do NOT recommend syncing your virtual environment to the cluster using the sync functionality of the extension. It is likely to fail, and we can't guarantee that the correct libraries will be used.