databricks/databricks-vscode

[FEATURE] Support for SparkContext and RDDs in Python notebooks when debugging locally

alex-woodhouse opened this issue · 2 comments

I have been keen to use the extension for several months, and I think almost everything our project uses has been implemented, aside from SparkContext and RDDs. I can see here that Databricks Connect currently does not support these features. Is there a plan to support them in the near future?

I'm also unable to use pdb to debug in my online Databricks workspace, because breakpoints don't seem to work when using streaming in a notebook.

I'd be very grateful for any suggestions if I've missed anything.

Hi @alex-woodhouse. Unfortunately, we do not currently plan to support RDDs or SparkContext. We are working on other features that will allow debugging of the entire Python code on the cluster, but we do not have a fixed timeline for this either. I will leave the ticket open to gather interest.

For local debugging, we rely on Spark Connect, which by design doesn't support RDDs or SparkContext. This is not something we can address from the VS Code side.
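
As a possible workaround in the meantime, code that must run both on the cluster and locally through Spark Connect could check for a usable SparkContext and fall back to the DataFrame API. The following is only a minimal sketch: `has_spark_context` and `sum_of_squares` are hypothetical helper names, and it assumes that reading `spark.sparkContext` on a Spark Connect session raises an exception rather than returning an object.

```python
def has_spark_context(spark) -> bool:
    """Return True if this session exposes a usable SparkContext.

    Assumption: on a Spark Connect session, reading `spark.sparkContext`
    raises an exception, so any failure is treated as "no RDD support".
    """
    try:
        return spark.sparkContext is not None
    except Exception:
        return False


def sum_of_squares(spark, n: int):
    """Sum i*i for i in [0, n), using RDDs only where they are available."""
    if has_spark_context(spark):
        # Classic session (e.g. running on the cluster): the RDD API works.
        return (
            spark.sparkContext.parallelize(range(n))
            .map(lambda x: x * x)
            .sum()
        )
    # Spark Connect session (e.g. local debugging): use DataFrame operations.
    return spark.range(n).selectExpr("sum(id * id) AS total").first()[0]
```

Logic that depends on RDD-only features would still need to be rewritten with DataFrame operations before it can be debugged locally; the guard only keeps a single code path working in both environments.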