databricks/databricks-vscode

[BUG] "Missing required field 'UserContext' in the request." when using Databricks Connect

felipeff opened this issue · 4 comments

Describe the bug

After following all the steps at https://learn.microsoft.com/en-us/azure/databricks/dev-tools/vscode-ext#--run-or-debug-python-code-with-databricks-connect to enable the new beta feature for debugging Python code with the latest version of Databricks Connect, I get the following exception:

Exception has occurred: SparkConnectGrpcException
<_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Missing required field 'UserContext' in the request."
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Missing required field 'UserContext' in the request.", grpc_status:3, created_time:"2023-04-26T21:11:46.657302244+00:00"}"

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Missing required field 'UserContext' in the request."
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Missing required field 'UserContext' in the request.", grpc_status:3, created_time:"2023-04-26T21:11:46.657302244+00:00"}"

During handling of the above exception, another exception occurred:

File "D:\Dev\Gitlab\azure-databricks\Test.py", line 6, in <module>
print(df)
pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Missing required field 'UserContext' in the request."
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Missing required field 'UserContext' in the request.", grpc_status:3, created_time:"2023-04-26T21:11:46.657302244+00:00"}"

To Reproduce
Steps to reproduce the behaviour:

  1. Install v0.3.11 of the plugin.
  2. Follow all the steps outlined here: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/vscode-ext#--run-or-debug-python-code-with-databricks-connect
  3. Create a new Python file in Visual Studio Code with the following code snippet and run it:

from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()
df = spark.range(1,11)
df.show()

System information:

  1. Paste the output of the Help: About command (CMD-Shift-P).
    Version: 1.77.3 (system setup)
    Commit: 704ed70d4fd1c6bd6342c436f1ede30d1cff4710
    Date: 2023-04-12T09:16:02.548Z
    Electron: 19.1.11
    Chromium: 102.0.5005.196
    Node.js: 16.14.2
    V8: 10.2.154.26-electron.0
    OS: Windows_NT x64 10.0.22621
    Sandboxed: Yes

  2. Databricks Extension Version
    0.3.11

Databricks Extension Logs
I can't provide the full logs because they contain sensitive information such as the workspace URL, cluster ID, and token ID. If you can provide an email address, I can send them directly to whoever is investigating this issue.

@nija-at can you provide some more color on circumstances under which we can see these errors from dbconnect?

This occurs when the environment does not have the USER variable set.
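A quick way to check whether this applies to your environment is to look at the USER variable from the same Python interpreter that runs your script. The snippet below is only a sketch; the function name describe_user_env is made up for illustration. As a side note, Windows normally exposes the login name as USERNAME rather than USER, which may be why this shows up there.

```python
import os
from typing import Mapping, Optional


def describe_user_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Return 'missing', 'empty', or 'set' for the USER variable.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try out without touching os.environ.
    """
    env = os.environ if env is None else env
    if "USER" not in env:
        return "missing"
    return "set" if env["USER"] else "empty"


if __name__ == "__main__":
    print(f"USER is {describe_user_env()}")
```

If this reports "missing", the error above is the likely result, assuming the diagnosis in this thread applies to your setup.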

The workaround here is to add user_id=none to the connection string.
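For anyone building the connection string by hand, here is a rough sketch of that workaround. It assumes the Spark Connect style of connection string, sc://host:port/;key=value;key=value; the helper name add_user_id and the example host are hypothetical, and the value none is taken from the comment above.

```python
def add_user_id(conn: str, user_id: str = "none") -> str:
    """Append ;user_id=<user_id> unless a user_id parameter is already present.

    Assumes parameters follow a "/;" separator, as in
    sc://host:port/;token=...;user_id=...
    """
    base, sep, params = conn.partition("/;")
    existing = [p for p in params.split(";") if p] if sep else []
    if any(p.startswith("user_id=") for p in existing):
        return conn  # leave an explicit user_id untouched
    if not sep:
        # No parameter section yet: drop a trailing slash before adding one.
        base = conn.rstrip("/")
    existing.append(f"user_id={user_id}")
    return base + "/;" + ";".join(existing)


if __name__ == "__main__":
    # Placeholder host and token, for illustration only.
    print(add_user_id("sc://example.test:443/;token=abc"))
```

How the resulting string is passed to the session depends on how you configure Databricks Connect, so treat this as string handling only.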

We have a fix for this. It will ship in the next DB Connect release, after which this workaround is no longer required.

@felipeff, for now, you can set the USER variable. To do this:

  1. Create a .env file in the root of your project (this is the file pointed to by the databricks.python.envFile variable).
  2. Add USER="" to the file on a new line.

You should be able to use dbconnect after that.
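The two steps above can also be scripted. This is a minimal sketch, assuming the .env file lives where databricks.python.envFile points; the helper name ensure_user_line is hypothetical, and it writes the same USER="" line suggested above unless you pass a different value.

```python
from pathlib import Path


def ensure_user_line(env_file: Path, value: str = "") -> bool:
    """Append USER="<value>" to env_file unless it already defines USER.

    Creates the file if it does not exist. Returns True if the file
    was modified, False if a USER= line was already there.
    """
    text = env_file.read_text() if env_file.exists() else ""
    for line in text.splitlines():
        if line.strip().startswith("USER="):
            return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    env_file.write_text(text + f'USER="{value}"\n')
    return True


if __name__ == "__main__":
    changed = ensure_user_line(Path(".env"))
    print("added USER line" if changed else "USER already defined")
```

Run it from the project root so that .env resolves to the right file.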

@kartikgupta-db thanks. This workaround did the trick. dbconnect works now.