Azure/azure-storage-fuse

Unable to mount the Datalake storage via proxy on an Azure virtual machine

kumaran-sowrirajan opened this issue · 2 comments

Which version of blobfuse was used?

blobfuse2 version 2.3.0

Which OS distribution and version are you using?

Linux (Ubuntu 20.04) installed on an Azure virtual machine

If relevant, please share your mount command.

sudo blobfuse2 mount all ./datalakecontainer/ --config-file=./fuse2_config.yaml --foreground=true --disable-version-check=true --log-level=LOG_DEBUG

What was the issue encountered? I am unable to mount the Datalake storage via proxy on an Azure virtual machine.

  1. Tried with the access key and the shared access signature.

  2. Both the virtual machine and the storage account are under the same subscription and located in the same region.

  3. I can access the storage account (create/read/delete a file) via the Azure Java SDK and via curl from the Azure virtual machine.

  4. Public access is NOT allowed and private endpoints are enabled on this storage account.

  5. Tried with both the blob and dfs endpoint:

    https://<STORAGE_ACCOUNT_NAME>.blob.core.windows.net/
    https://<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/

  6. No logs are being generated to show what is going on. The only error I see is the one below, from the terminal:

Error: failed to get container list from storage [Get "https://<STORAGE_ACCOUNT_NAME>.blob.core.windows.net/?comp=list&": proxyconnect tcp: dial tcp: lookup : no such host]

Have you found a mitigation/solution? No

Not sure whether this issue is related to Azure blob fuse via a proxy, but the temporary mitigation explained in that issue did not work.

Below is the fuse2_config.yaml

logging:
  type: syslog
  level: log_debug

components:
  - libfuse
  - block_cache
  - file_cache
  - attr_cache
  - azstorage

libfuse:
  attribute-expiration-sec: 120
  entry-expiration-sec: 120
  negative-entry-expiration-sec: 240

block_cache:
  path: ~/blocks

file_cache:
  path: ~/local_cache

attr_cache:
  timeout-sec: 7200

azstorage:
  type: adls
  account-name: <STORAGE_ACCOUNT_NAME>
  account-key: <ACCOUNT_ACCESS_KEY>
  endpoint: https://<STORAGE_ACCOUNT_NAME>.blob.core.windows.net/
  mode: key
  container: <CONTAINER_NAME>

Please share logs if available.

  • You are using syslog, so the logs are redirected to the Linux syslog service. Depending on your system settings, you can find them under /var/log/syslog, /var/log/messages, or /var/log/blobfuse2.log.
  • In terms of configuration, you need to correct a few things:
    • block_cache and file_cache cannot coexist, so remove "- block_cache" from the components section of your config file.
    • type: adls should be given only if your account has HNS (hierarchical namespace) enabled.
    • If it is an HNS account, the endpoint provided is incorrect: for an HNS account it should be .dfs.core.windows.net and not .blob.*
  • If you are using a proxy environment, make sure the https_proxy environment variable is exposed and accessible to the blobfuse2 process. Many times we have seen customers configure environment variables and then mount using sudo, where the same environment variables are not exposed.
  • If you are using a private endpoint:
    • either the endpoint in the azstorage section should point to the private endpoint,
    • OR you should have DNS resolution in place so that account.blob.core.windows.net resolves to your private endpoint.
    • If it is an HNS account, make sure you have private endpoints configured for both the blob and dfs endpoints.
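As a rough sketch of the configuration corrections listed above, assuming the account is HNS-enabled, the snippet below writes a corrected copy of the posted config. The output file name fuse2_config.fixed.yaml is hypothetical, and the <...> placeholders are carried over from the original post and still need real values.

```shell
# Sketch only: writes a corrected copy of the posted config.
# The file name fuse2_config.fixed.yaml is a hypothetical choice.
cat > fuse2_config.fixed.yaml <<'EOF'
logging:
  type: syslog
  level: log_debug

components:
  - libfuse
  - file_cache        # block_cache entry removed; it cannot coexist with file_cache
  - attr_cache
  - azstorage

libfuse:
  attribute-expiration-sec: 120
  entry-expiration-sec: 120
  negative-entry-expiration-sec: 240

file_cache:
  path: ~/local_cache

attr_cache:
  timeout-sec: 7200

azstorage:
  type: adls          # assumption: the account is HNS-enabled
  account-name: <STORAGE_ACCOUNT_NAME>
  account-key: <ACCOUNT_ACCESS_KEY>
  endpoint: https://<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/   # dfs, not blob, for HNS
  mode: key
  container: <CONTAINER_NAME>
EOF
```

If the account is not HNS-enabled, the type: adls line would be dropped and the endpoint would stay on .blob.core.windows.net instead.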
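For the proxy and private-endpoint points, a small shell sketch may help narrow things down. The empty name between "lookup" and ":" in the reported error suggests the proxy host came through empty, which would fit https_proxy not reaching the process started under sudo. The helper names proxy_host_of and check_proxy_env are hypothetical and not part of blobfuse2; the URL parsing is deliberately simple and does not handle IPv6 literals.

```shell
# Sketch only: hypothetical helpers, not part of blobfuse2.

# Print the host part of a proxy URL such as http://user:pw@proxy.example:3128/
proxy_host_of() {
    url=$1
    rest=${url#*://}              # drop the scheme, if present
    rest=${rest%%/*}              # drop any path
    rest=${rest##*@}              # drop any user:password@ prefix
    printf '%s\n' "${rest%%:*}"   # drop the port
}

# Report whether https_proxy (or HTTPS_PROXY) yields a non-empty host.
check_proxy_env() {
    host=$(proxy_host_of "${https_proxy:-${HTTPS_PROXY:-}}")
    if [ -z "$host" ]; then
        echo "https_proxy is unset or has no host"
        return 1
    fi
    echo "proxy host: $host"
}

# Manual checks to run on the VM (left as comments because they need sudo and network):
#   env | grep -i '_proxy'         # what your shell sees
#   sudo env | grep -i '_proxy'    # what a process started with sudo sees
#   sudo --preserve-env=https_proxy blobfuse2 mount all ./datalakecontainer/ --config-file=./fuse2_config.yaml
#   getent hosts <STORAGE_ACCOUNT_NAME>.dfs.core.windows.net   # should resolve to the private endpoint address
```

Running check_proxy_env in the same shell that launches the mount gives a quick yes/no on whether a usable proxy host is set; the commented sudo commands then show whether that value survives the switch to root.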

Thank you. Both the blob and dfs endpoints worked.