Azure/azure-cosmosdb-spark

Unable to write or read (unencrypted) data from Cosmos DB containers created via Cosmos DB SDK encryption

Opened this issue · 3 comments

We have an encrypted container created using the Cosmos DB SDK, following https://learn.microsoft.com/en-us/azure/cosmos-db/how-to-always-encrypted?tabs=dotnet. We are able to read and write to it from a Function App as described in that article.

We are trying to read and write data in this same container from Databricks (using Cosmos DB library 2.3.0), but are unable to.

When writing to this container from Databricks, we get the error: "CosmosHttpResponseError: (BadRequest) Message: {"Errors":["Collection has ClientEncryptionPolicy set, but the document to be written isn't encrypted."]}"
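For reference, a rough sketch of the kind of write we attempt from Databricks, assuming this repo's Spark 2 connector (com.microsoft.azure.cosmosdb.spark) is attached to the cluster; account, key, database, and collection names below are placeholders and our actual code may differ, but the service returns the same BadRequest for any plain-text write to a container with a ClientEncryptionPolicy:

```python
# Databricks notebook: 'spark' is the SparkSession provided by the cluster.
# Placeholder account/database/collection values.
write_config = {
    "Endpoint": "https://<your-account>.documents.azure.com:443/",
    "Masterkey": "<your-account-key>",
    "Database": "mydb",              # hypothetical database name
    "Collection": "encrypted-coll",  # container created with a ClientEncryptionPolicy
    "Upsert": "true",
}

df = spark.createDataFrame(
    [("1", "Alice", "123-45-6789")],
    ["id", "name", "ssn"],  # 'ssn' stands in for a path covered by the encryption policy
)

# The connector sends the documents as-is (no client-side encryption), so the
# service rejects them with the BadRequest message quoted above.
(df.write
   .format("com.microsoft.azure.cosmosdb.spark")
   .mode("overwrite")
   .options(**write_config)
   .save())
```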

When reading from this container from Databricks, the read succeeds, but the encrypted properties come back as ciphertext (they should come back decrypted / in plain text).
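And roughly how we read it back (same connector, same placeholder config; again a sketch rather than our exact code):

```python
# Read path via the Spark 2 connector; placeholder values as above.
read_config = {
    "Endpoint": "https://<your-account>.documents.azure.com:443/",
    "Masterkey": "<your-account-key>",
    "Database": "mydb",
    "Collection": "encrypted-coll",
    "query_custom": "SELECT * FROM c",  # optional custom query
}

df = (spark.read
        .format("com.microsoft.azure.cosmosdb.spark")
        .options(**read_config)
        .load())

# Properties covered by the ClientEncryptionPolicy (e.g. 'ssn') come back as the
# stored ciphertext, since the connector has no knowledge of the encryption keys.
df.select("id", "ssn").show(truncate=False)
```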

We have an open ticket with Microsoft on this, and upon investigation found that this library (Cosmos DB library 2.3.0) does not have the encryption functionality available in its DLL.

My question is: if Databricks is offered as a service for interacting with Cosmos DB, why aren't all of the features available via the Cosmos DB SDK also available in the Databricks libraries?

Hello @ssharma444. You are correct that not all Cosmos DB features are supported in all Cosmos DB client libraries. In this case, demand for the feature has been much higher for our SDKs than for our Spark Connector, where it has been very low. That said, it is still on our backlog, and our current plan is to deliver encryption support for the Spark 3 Connector within the next 3-6 months.

Meanwhile, I want to point out that encryption support will not be delivered in the version of the Spark Connector that this repo relates to. This repo is for the older Cosmos Spark Connector for Spark 2 (note: Spark 2 itself will soon reach end of life on Databricks), and this connector is also based on deprecated versions of our Java SDK.

We strongly recommend upgrading to the Spark 3 OLTP Connector if possible, as this is where we will be adding encryption support. Feel free to raise this issue in the repo where our latest Spark Connector lives: https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/cosmos/azure-cosmos-spark_3-2_2-12. We will track that issue against our work items for completion of this feature.
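For anyone migrating, a minimal sketch of the equivalent read/write with the Spark 3 OLTP connector (azure-cosmos-spark_3-2_2-12), with placeholder account, database, and container values. Note that, per this thread, client-side encryption is not supported there yet either, so a container with a ClientEncryptionPolicy will still reject plain-text writes:

```python
# Databricks notebook with the Spark 3 OLTP connector attached; 'spark' is provided.
cfg = {
    "spark.cosmos.accountEndpoint": "https://<your-account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<your-account-key>",
    "spark.cosmos.database": "mydb",          # placeholder database name
    "spark.cosmos.container": "my-container", # placeholder container name
}

df = spark.createDataFrame([("1", "Alice")], ["id", "name"])

# Write (no client-side encryption is applied by the connector).
(df.write
   .format("cosmos.oltp")
   .options(**cfg)
   .mode("APPEND")
   .save())

# Read back.
result = (spark.read
            .format("cosmos.oltp")
            .options(**cfg)
            .load())
result.show()
```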

Hello @TheovanKraay, has encryption support been added to the Spark 3 OLTP connector yet? I am facing the same issue as described by @ssharma444.

Hi @atulsi1 - no, support for client-side encryption has not been added to the Spark Connector yet, and there are currently no plans to add it within at least the next 6 months.