databricks/databricks-vscode

[BUG] PATH environment variable conflicts for Venv in PowerShell

Closed this issue · 4 comments

When both the Python extension and the Databricks extension are enabled in VS Code, the Databricks extension overwrites the venv's PATH environment variable in terminals, when running or debugging Python scripts, and when executing tests.

With both extensions installed, only the Databricks PATH is applied. This means that Python itself and other Python tools, e.g. pip, aren't found in the right virtual environment.

This happens regardless of whether Databricks Connect is enabled.

Details:

The Python VS Code extension prepends the .venv directories to the PATH:

PATH=c:\Users\sambornk\.vscode\extensions\ms-python.python-2024.0.1\pythonFiles\deactivate\powershell;C:\Users\sambornk\source\scratch\venv_activation_test\.venv\Scripts;${env:PATH}

and the Databricks extension does something similar:

PATH=c:\Users\sambornk\.vscode\extensions\databricks.databricks-1.2.7-win32-x64\bin;${env:PATH}

PATH when the Databricks extension is disabled or not installed:

PS C:\Users\sambornk\source\scratch\venv_activation_test> $env:PATH
c:\Users\sambornk\.vscode\extensions\ms-python.python-2024.0.1\pythonFiles\deactivate\powershell;C:\Users\sambornk\source\scratch\venv_activation_test\.venv\Scripts;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\dotnet\;C:\Program Files (x86)\Microsoft SQL Server\150\DTS\Binn\;C:\Program Files\Azure Data Studio\bin;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\dotnet\;C:\Users\sambornk\AppData\Local\Programs\Python\Launcher\;C:\Users\sambornk\AppData\Local\Programs\Python\Python39\Scripts\;C:\Users\sambornk\AppData\Local\Programs\Python\Python39\;C:\Users\sambornk\AppData\Local\Microsoft\WindowsApps;C:\Users\sambornk\AppData\Local\Programs\Git LFS;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer\Git\cmd;C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer\Git\cmd;C:\Users\sambornk\AppData\Local\Programs\Microsoft VS Code\bin;

PATH when the Databricks extension is enabled:

PS C:\Users\sambornk\source\scratch\venv_activation_test> $env:PATH
c:\Users\sambornk\.vscode\extensions\databricks.databricks-1.2.7-win32-x64\bin;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\dotnet\;C:\Program Files (x86)\Microsoft SQL Server\150\DTS\Binn\;C:\Program Files\Azure Data Studio\bin;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\dotnet\;C:\Users\sambornk\AppData\Local\Programs\Python\Launcher\;C:\Users\sambornk\AppData\Local\Programs\Python\Python39\Scripts\;C:\Users\sambornk\AppData\Local\Programs\Python\Python39\;C:\Users\sambornk\AppData\Local\Microsoft\WindowsApps;C:\Users\sambornk\AppData\Local\Programs\Git LFS;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer\Git\cmd;C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer\Git\cmd;C:\Users\sambornk\AppData\Local\Programs\Microsoft VS Code\bin;

You can see that the virtual environment's prepended PATH entries are missing.
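To make the expected behaviour concrete, here is a minimal Python sketch of how two prepend-style contributions would compose if both were applied. It is only a model of the idea, not how VS Code or either extension actually implements it; the function name `apply_prepends` and the shortened paths are made up for illustration.

```python
def apply_prepends(base_path, contributions, sep=";"):
    """Prepend each contribution to base_path in order; later ones end up first."""
    entries = base_path.split(sep)
    for contribution in contributions:
        entries = contribution.split(sep) + entries
    return sep.join(entries)


# Hypothetical, shortened stand-ins for the real directories shown above.
base = r"C:\WINDOWS\system32;C:\WINDOWS"
python_ext = r"C:\proj\.venv\Scripts"
databricks_ext = r"C:\ext\databricks\bin"

# What I would expect: both contributions present.
expected = apply_prepends(base, [python_ext, databricks_ext])

# What the terminal actually shows: only the Databricks entry.
observed = apply_prepends(base, [databricks_ext])

print(python_ext in expected.split(";"))
print(python_ext in observed.split(";"))
```

In this model the venv directory survives when both contributions are applied, which is what the PATH with Databricks enabled should look like.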

To Reproduce
  1. Have both the Python VS Code extension and the Databricks VS Code extension installed and enabled.
  2. Create a Python virtual environment.
  3. Open a PowerShell terminal and look at $env:PATH.

Or run this simple script:

import sys, os
import shutil

print(sys.version)

print(shutil.which("python"))

for name, value in os.environ.items():
    print("{0}: {1}".format(name, value))
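A slightly more targeted check is sketched below: it tests directly whether the virtual environment's executable directory is an entry of PATH. The helper name `venv_dir_on_path` is made up for this example, and it assumes the script is run with the venv's interpreter so that `sys.prefix` points at the venv root.

```python
import os
import sys


def venv_dir_on_path(path_value, venv_bin_dir, sep=os.pathsep):
    """Return True if venv_bin_dir is one of the entries of path_value."""

    def norm(p):
        # Ignore case (on Windows) and trailing separators when comparing.
        return os.path.normcase(os.path.normpath(p))

    target = norm(venv_bin_dir)
    return any(norm(entry) == target for entry in path_value.split(sep) if entry)


if __name__ == "__main__":
    # "Scripts" on Windows, "bin" elsewhere.
    bin_dir = os.path.join(sys.prefix, "Scripts" if os.name == "nt" else "bin")
    print("running inside a venv:", sys.prefix != sys.base_prefix)
    print("venv dir on PATH:", venv_dir_on_path(os.environ.get("PATH", ""), bin_dir))
```

With the problem described here, I would expect the second line to print False when the Databricks extension is enabled.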

Screenshots

This can be inspected by hovering over a terminal window and selecting "Show Environment Contributions".

image

You can then see what is happening:

image

(You can also see that the Databricks authorisation variables are listed as coming from the activated environment, but that's not right. Maybe it's related.)

For what it's worth, the rest of the environment seems OK, other than the duplicated Databricks authorisation variables.

System information:

  1. Paste the output of the Help: About command (CMD-Shift-P).

Version: 1.86.2 (user setup)
Commit: 903b1e9d8990623e3d7da1df3d33db3e42d80eda
Date: 2024-02-13T19:40:56.878Z
Electron: 27.2.3
ElectronBuildId: 26908389
Chromium: 118.0.5993.159
Node.js: 18.17.1
V8: 11.8.172.18-electron.0
OS: Windows_NT x64 10.0.19045

  2. Databricks Extension Version

v1.2.7

Hi @ksamborn. What version of the Python extension are you on? Can you try with an older version of the Python extension (2023.16, for example)? I seem to have observed similar behaviour. I did not investigate the PATH issue, but the Python environment was not getting updated. Downgrading seems to have fixed everything.

The env variable duplication is expected.

Hi Kartik -

Sorry - I thought I included that. I have v2024.0.1 installed. I'll take a look.

For what it's worth, I reported this as well - microsoft/vscode-python#22630 (comment)

So maybe it's all connected.

Thanks for the quick reply

@ksamborn did you try after downgrading the Python extension? Does that work for you?

Hi Kartik - I apologise - I didn't try. After your comment, I went and looked at the open issues for the VS Code Python extension, and there are many in this area. I think I will wait for them to fix it...

In the meantime, I can just manually set the PATH, and with Databricks Connect, PYSPARK_PYTHON isn't necessary.

Thanks for following up!
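For anyone else hitting this, the manual PATH workaround mentioned above could look roughly like the sketch below when done from Python rather than in the shell. This is only an illustration: the helper name `with_venv_first` and the example paths are invented, and it builds a modified copy of an environment mapping instead of changing the terminal itself.

```python
import os


def with_venv_first(env, venv_bin_dir, sep=os.pathsep):
    """Return a copy of env whose PATH starts with venv_bin_dir."""
    new_env = dict(env)
    # Drop empty entries and any existing copy of the venv dir, then put it first.
    entries = [e for e in new_env.get("PATH", "").split(sep) if e and e != venv_bin_dir]
    new_env["PATH"] = sep.join([venv_bin_dir] + entries)
    return new_env


# Hypothetical example values; substitute the real .venv\Scripts directory.
example_env = {"PATH": r"C:\ext\databricks\bin;C:\WINDOWS\system32"}
fixed = with_venv_first(example_env, r"C:\proj\.venv\Scripts", sep=";")
print(fixed["PATH"])
```

The resulting PATH string can be handed to `shutil.which("python", path=fixed["PATH"])` to confirm which interpreter would be found first, or the whole mapping can be passed as `env=` when starting a child process.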