I want to display the latest blob data using the streaming method.
Shunya-Seki opened this issue · 9 comments
Which version of blobfuse was used?
Blobfuse2 version: 2.3.0~preview.1
Which OS distribution and version are you using?
Rhel8.8
If relevant, please share your mount command.
◆mount command
blobfuse2 mount /blobmount --config-file=/etc/blobfuse2config.yaml -o allow_other
◆config.yaml
# Refer ./setup/baseConfig.yaml for full set of config parameters
logging:
  type: syslog
  level: log_debug
components:
  - libfuse
  - stream
  - attr_cache
  - azstorage
libfuse:
  attribute-expiration-sec: 0
  entry-expiration-sec: 0
  negative-entry-expiration-sec: 0
  direct-io: true
stream:
  block-size-mb: 0
  max-buffers: 0
  buffer-size-mb: 0
attr_cache:
  timeout-sec: 7200
azstorage:
  type: block
  account-name: xxxxx
  objid: xxxxxxxxx
  endpoint: xxxxxx
  mode: msi
  container: xxxx
What was the issue encountered?
I want to use the Stream method and ensure that the data in Blob Storage is always up to date on the OS side where the mount is performed.
Is the above configuration okay?
I want to confirm just in case.
Have you found a mitigation/solution?
The configuration seems to be working fine, and the latest BLOB data is being displayed without any issues.
Please share logs if available.
If you want the local contents to refresh whenever they are updated in the container, this configuration will not work. What you need here is the '-o direct_io' CLI parameter. 'stream' is not a stable component, so you can migrate to 'block-cache' instead. Sample command and config below:
blobfuse2 mount /blobmount --config-file=/etc/blobfuse2config.yaml -o allow_other -o direct_io
logging:
  type: syslog
  level: log_debug
components:
  - libfuse
  - block_cache
  - attr_cache
  - azstorage
libfuse:
  attribute-expiration-sec: 0
  entry-expiration-sec: 0
  negative-entry-expiration-sec: 0
block_cache:
  block-size-mb: 8
  mem-size-mb: 2048
  prefetch: 12
  parallelism: 64
attr_cache:
  timeout-sec: 7200
azstorage:
  account-name: xxxxx
  objid: xxxxxxxxx
  mode: msi
  container: xxxx
I see you are using objid for MSI-based authentication. It's advised to change to appid-based authentication, as objid-based authentication is not natively supported and needs azcli to be installed as well. If you are using an Azure VM, you can assign the identity to the VM itself and then skip providing any appid/objid in the config file.
Closing this as there is no action item on blobfuse here. Feel free to post your questions/queries here.
Thank you for the information. I've implemented the provided config and mount, but it's not updating.
I'm checking the content with the following steps:
①Configuration settings (Received config file)
②Mounting
blobfuse2 mount /blobmount --config-file=/etc/blobfuse2config.yaml -o allow_other -o direct_io
③Confirming the content with the following command
cat /blobmount/contents
④Updating the content (From Azure Portal)
⑤Reconfirming the content with the following command
cat /blobmount/contents
Additionally, I was able to mount it without any issues, skipping the "objid".
(I'm using Azure VM)
Remove "attr_cache" from the "components" section in your config file and remount.
As you have enabled debug logging, you can check the logs when you issue the cat command for the second time. You should see a file open call for it, followed by some downloads. If that's not happening, you can share the log files with us.
I removed 'attr_cache' from the config and remounted. Now the latest content is being displayed. Thank you for your assistance.
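For reference, the config that resolved this can be sketched as below. It is the maintainer's sample config with "attr_cache" removed from "components". Two things here are assumptions rather than statements from the thread: the standalone "attr_cache" section is also dropped (the reply only asks to remove it from "components"), and "objid" is omitted because mounting on the Azure VM worked without it. The mount command still uses '-o direct_io'.

```yaml
# Sketch only: sample config from this thread, with attr_cache removed.
logging:
  type: syslog
  level: log_debug

components:
  - libfuse
  - block_cache
  - azstorage        # attr_cache intentionally not listed

libfuse:
  attribute-expiration-sec: 0
  entry-expiration-sec: 0
  negative-entry-expiration-sec: 0

block_cache:
  block-size-mb: 8
  mem-size-mb: 2048
  prefetch: 12
  parallelism: 64

# Assumption: the attr_cache section is dropped along with the component.

azstorage:
  account-name: xxxxx
  mode: msi          # objid omitted; identity is assigned to the Azure VM
  container: xxxx
```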
Could you provide additional information?
Would it be okay for all settings of 'block_cache' to be set to '0' when displaying always-updated content, as in this question?
block-size-mb:
mem-size-mb:
prefetch:
parallelism:
No, in the block-cache model you cannot set all these parameters to 0, as that would mean no memory is allocated to hold the incoming data. You can tune these parameters based on your available memory and average file size.
Thank you.
The memory of the Azure VM and the average of the read files are as follows. Are there any recommended values for these parameters in this case?
The memory of the Azure VM:32GiB
average of the read files(read only):0.8GB
◆Parameters of Block_cache
block-size-mb:
mem-size-mb:
prefetch:
parallelism:
Is there a workload to determine the parameters for "block-cache"?
Keep "block-size-mb" at 16, and allocate "mem-size-mb" based on the available memory. I see you have 32GB of memory in your VM, so you can put 20GB for this value (if you are mounting only one instance of blobfuse and there are no other memory-hungry applications running on the same node). You can set "prefetch" to 50, as the average file size is not too large. For "parallelism" you can set it to 50 as well.
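Put together, the values suggested above would look roughly like the following "block_cache" section. This is a sketch of the recommendation, not a tested configuration. The one assumption is the unit conversion: "mem-size-mb" is given in MB, so 20GB is written as 20 * 1024 = 20480.

```yaml
# Sketch of the suggested values for a 32GiB VM reading files of about 0.8GB.
block_cache:
  block-size-mb: 16
  mem-size-mb: 20480   # 20GB in MB (assumed conversion: 20 * 1024)
  prefetch: 50         # 50 blocks * 16MB = 800MB, close to the average file size
  parallelism: 50
```

These numbers assume a single blobfuse mount and no other memory-hungry applications on the node, as noted in the reply. Reduce "mem-size-mb" if either condition does not hold.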