Azure/spark-cdm-connector

Cannot be used in Azure China

Opened this issue · 2 comments

The Spark CDM Connector doesn't use the correct fs.azure.account.oauth2.client.endpoint in Azure China. For hostnames such as *.dfs.core.chinacloudapi.cn, the endpoint should be https://login.partner.microsoftonline.cn, so the connector cannot find the manifest.cdm.json file. For Azure Global, the endpoint is https://login.microsoftonline.com/. The configuration is set in the class com.microsoft.cdm.utils.SerializedABFSHadoopConf.
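To illustrate the kind of lookup that seems to be missing, here is a minimal Python sketch that picks the login authority from the storage hostname. This is not the connector's code: the helper name oauth_endpoint_for, the mapping table, and the <authority>/<tenant>/oauth2/token URL shape are assumptions for illustration. The two login hosts come from the description above, and .dfs.core.windows.net is assumed as the Azure Global storage suffix.

```python
# Hypothetical mapping from ADLS Gen2 hostname suffix to login authority.
# The authorities are the ones quoted in this issue; the table itself and
# the Azure Global suffix are illustrative assumptions.
AUTHORITY_BY_SUFFIX = {
    ".dfs.core.chinacloudapi.cn": "https://login.partner.microsoftonline.cn",
    ".dfs.core.windows.net": "https://login.microsoftonline.com",
}


def oauth_endpoint_for(storage_host: str, tenant_id: str) -> str:
    """Return a token endpoint for the cloud that hosts storage_host.

    Assumes the value of fs.azure.account.oauth2.client.endpoint has the
    form <authority>/<tenant>/oauth2/token.
    """
    host = storage_host.strip().lower()
    for suffix, authority in AUTHORITY_BY_SUFFIX.items():
        if host.endswith(suffix):
            return f"{authority}/{tenant_id}/oauth2/token"
    # Fail loudly instead of silently falling back to the global endpoint.
    raise ValueError(f"Unrecognized storage hostname: {storage_host}")
```

With something like this, a host such as myaccount.dfs.core.chinacloudapi.cn would resolve to the Azure China authority, and an unknown cloud would raise an error rather than defaulting to Azure Global.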

Hi @simon
Can you try using token-based access control? Do you see the same issue?

@simonzhaoms We had a similar issue trying to connect to Azure Government. The fix for us was to use token-based access control (i.e. managed identity), as @srichetar mentioned. The piece that's missing from the documentation is that your user needs the Storage Blob Data Contributor role on the storage account, even if it already has the regular Owner or Contributor role.