databricks/databricks-vscode

[BUG] JVM_ATTRIBUTE_NOT_SUPPORTED when using shared cluster policy

Describe the bug
Can't run the example notebook.py on a cluster with a shared policy.

To Reproduce
Steps to reproduce the behavior:

  1. Create a cluster with a shared policy
  2. Try to run any Python file on the cluster using the VS Code Databricks extension

System information:

Version: 1.87.0 (Universal)
Commit: 019f4d1419fbc8219a181fab7892ebccf7ee29a2
Date: 2024-02-27T23:42:56.944Z
Electron: 27.3.2
ElectronBuildId: 26836302
Chromium: 118.0.5993.159
Node.js: 18.17.1
V8: 11.8.172.18-electron.0
OS: Darwin arm64 23.3.0

Databricks Extension Version: 1.2.7
Cluster runtime: 14.3.x

Databricks Extension Logs

3/7/2024, 10:58:38 AM - Running notebooks/notebook.py ...
---------------------------------------------------------------------------
PySparkAttributeError                     Traceback (most recent call last)
File /databricks/spark/python/pyspark/sql/connect/session.py:772, in SparkSession.__getattr__(self, name)
    770 def __getattr__(self, name: str) -> Any:
    771     if name in ["_jsc", "_jconf", "_jvm", "_jsparkSession", "sparkContext", "newSession"]:
--> 772         raise PySparkAttributeError(
    773             error_class="JVM_ATTRIBUTE_NOT_SUPPORTED", message_parameters={"attr_name": name}
    774         )
    775     return object.__getattribute__(self, name)
PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jvm` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session. Visit https://spark.apache.org/docs/latest/sql-getting-started.html#starting-point-sparksession for creating regular Spark Session in detail.
3/7/2024, 10:58:43 AM - Done (took 23452ms)
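
For reference, shared-access clusters on DBR 14.x run Python through Spark Connect, where JVM-backed attributes such as `spark._jvm` are unavailable; the traceback suggests the runner hits exactly that path. Below is a minimal sketch of how code can detect a Connect session before touching those attributes (the `is_spark_connect` helper is hypothetical, just a module-path check):

```python
# Sketch: detect a Spark Connect session before touching JVM-backed
# attributes, which raise PySparkAttributeError under Connect.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def is_spark_connect(session) -> bool:
    # Hypothetical helper: Connect sessions are implemented under
    # pyspark.sql.connect, so the module path distinguishes them.
    return type(session).__module__.startswith("pyspark.sql.connect")

if is_spark_connect(spark):
    print("Spark Connect session: `_jvm` and `sparkContext` are unavailable")
else:
    jvm = spark._jvm  # only safe on a classic (non-Connect) session
```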

Additional context
I can run the same notebook.py on a personal cluster, but not on the shared one.
I can run the same notebook.py in Databricks itself, where it's uploaded, on the shared cluster.
If I change the cluster policy to Single User, it then runs the notebook.py just fine.

I followed this tutorial, which says a personal cluster is recommended but not required.
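
For what it's worth, notebook code that sticks to the DataFrame API runs unchanged on both classic and Spark Connect sessions, which lines up with the notebook working inside Databricks on the shared cluster. A minimal Connect-compatible sketch:

```python
# Sketch: pure DataFrame-API code works on both classic and Spark Connect
# sessions, unlike JVM-backed attributes such as `_jvm`.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(5).withColumnRenamed("id", "n")
df.show()  # works on a shared (Spark Connect) cluster too
```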

Hi @AndreasBoegh. Thanks for reporting the issue. This is a bug in our internal runner. Since the fix should be a simple one, we can most likely release it early next week.