[BUG] unable to run spark code - BAD_REQUEST: Spark Connect (vsix from nightly run - 5455543994)
sauerchextern opened this issue · 3 comments
Hi,
first of all, your changes and adjustments look really promising! Thanks a lot.
**Describe the bug**
Whenever I run PySpark code in Visual Studio Code (Run Cell, Debug Cell, or Run Python File) I receive the following message:
"BAD_REQUEST: Spark Connect is enabled only on Unity Catalog enabled Shared and Single User Clusters."
**To Reproduce**
Steps to reproduce the behavior:
- Create a cluster
- Go to Visual Studio Code
- Install the artifact from this nightly run: https://github.com/databricks/databricks-vscode/actions/runs/5455543994
- Debug a cell with any Spark code, e.g. `spark.sql("USE default")`, or use the code below (Additional context).
**Additional context**
```python
import os
import sys
from datetime import date
from databricks.connect import DatabricksSession
from pyspark.sql.types import *
# COMMAND ----------
spark = DatabricksSession.builder.getOrCreate()
# COMMAND ----------
# Create a Spark DataFrame consisting of high and low temperatures
# by airport code and date.
schema = StructType([
    StructField('AirportCode', StringType(), False),
    StructField('Date', DateType(), False),
    StructField('TempHighF', IntegerType(), False),
    StructField('TempLowF', IntegerType(), False)
])
# COMMAND ----------
data = [
    ['BLI', date(2021, 4, 3), 52, 43],
    ['BLI', date(2021, 4, 2), 50, 38],
    ['BLI', date(2021, 4, 1), 52, 41],
    ['PDX', date(2021, 4, 3), 64, 45],
    ['PDX', date(2021, 4, 2), 61, 41],
    ['PDX', date(2021, 4, 1), 66, 39],
    ['SEA', date(2021, 4, 3), 57, 43],
    ['SEA', date(2021, 4, 2), 54, 39],
    ['SEA', date(2021, 4, 1), 56, 41]
]
temps = spark.createDataFrame(data, schema)
# COMMAND ----------
# Create a table on the Databricks cluster and then fill
# the table with the DataFrame's contents.
# If the table already exists from a previous run,
# delete it first.
spark.sql('USE default')
spark.sql('DROP TABLE IF EXISTS zzz_demo_temps_table')
temps.write.saveAsTable('zzz_demo_temps_table')
```
Hi @sauerchextern. I am not able to repro this. Can you check that the `SPARK_REMOTE` environment variable has the correct cluster id? You can inspect it with `os.environ['SPARK_REMOTE']`.
cc @nija-at do you know what could be the issue (assuming the configs from vscode are correct)?
Unity Catalog was not enabled in the workspace, and therefore not on the cluster.