Cannot write mmap'ed buffer content to disk using O_DIRECT flag
iMineLink opened this issue · 10 comments
Hello, I'm using this driver on Petalinux 2022.2 on a Xilinx RFSoC to receive data from the PL via DMA and dump it to SSD.
I have recently been switching to a zero-copy implementation of the software, in which the data in the mmap'ed buffer is written to the SSD through a file opened with the O_DIRECT flag (after manual cache synchronization).
Using the O_DIRECT flag causes the write to fail with the "Bad address" error (EFAULT), whereas without the O_DIRECT flag the write executes successfully.
The number of bytes written in each write is a multiple of the page size (4096), and I also checked that the address of the buffer is aligned to the page size.
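For reference, the alignment check described above can be sketched as follows. This is only an illustration: the helper name `direct_io_aligned` is made up here, and 4096 is the alignment used in this report. The actual O_DIRECT alignment requirement depends on the filesystem and the block device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if the buffer address, the transfer length
 * and the file offset are all multiples of `align`, 0 otherwise.
 * `align` must be a power of two (4096 in this report). */
int direct_io_aligned(const void *buf, size_t len, off_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t mask = (uintptr_t)align - 1;
    return (((uintptr_t)buf & mask) == 0) &&
           (((uintptr_t)len & mask) == 0) &&
           (((uintptr_t)offset & mask) == 0);
}
```

All three values pass this check in the setup described here, which suggests the "Bad address" error is not a plain alignment problem.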
Do you know what could be causing the "Bad address" error in the write when using the O_DIRECT flag and what could be done to fix it?
I can avoid the O_DIRECT flag, but that moves me further away from a zero-copy implementation of the data dumper.
Thanks a lot for your great effort in writing this driver,
Best Regards
Thanks for the issue.
The Direct I/O mechanism using the O_DIRECT flag is only valid for filesystems such as ext4 or btrfs.
u-dma-buf is not a filesystem and does not support Direct I/O.
Thanks for your reply and explanation.
I wanted to give some clarifications on my issue.
The destination of the write is indeed a file on an Ext4 filesystem, while the source of the write is the u-dma-buf mmap'ed buffer. This causes the "Bad address" error.
If I first memcpy the data from the u-dma-buf buffer to a local page-aligned buffer and then use the new buffer as the write source, the error goes away. The error also goes away if I only remove the O_DIRECT flag while still using the u-dma-buf buffer as the source for the write.
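A minimal sketch of the memcpy workaround described above, for anyone who wants to reproduce it. The function name `write_via_bounce` and the `DIO_ALIGN` constant are hypothetical, and 4096 is assumed as the alignment. The caller passes `O_DIRECT` (or 0) in `extra_flags`; `_GNU_SOURCE` is defined so that glibc exposes `O_DIRECT` to that caller.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* so that <fcntl.h> exposes O_DIRECT to callers */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 4096 /* assumed alignment, the page size in this report */

/* Copies `len` bytes from `src` (e.g. the u-dma-buf mapping) into an aligned
 * heap buffer and writes that buffer to `path`.
 * `extra_flags` is O_DIRECT or 0. Returns 0 on success, -1 with errno set. */
int write_via_bounce(const char *path, const void *src, size_t len, int extra_flags)
{
    assert(len > 0 && len % DIO_ALIGN == 0);

    void *bounce = NULL;
    int err = posix_memalign(&bounce, DIO_ALIGN, len);
    if (err != 0) {
        errno = err;
        return -1;
    }
    memcpy(bounce, src, len); /* the extra copy that zero-copy tries to avoid */

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0644);
    if (fd < 0) {
        err = errno;
        free(bounce);
        errno = err;
        return -1;
    }

    err = 0;
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, (const char *)bounce + done, len - done);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {
            err = (n < 0) ? errno : EIO;
            break;
        }
        done += (size_t)n;
    }

    if (close(fd) != 0 && err == 0)
        err = errno;
    free(bounce);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

Calling `write_via_bounce(path, udmabuf_ptr, len, O_DIRECT)` corresponds to the case that works here, at the cost of one memcpy per transfer.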
Is this issue still related to u-dma-buf not supporting Direct I/O? If so, can you see a way I could contribute to adding such a feature?
Thanks again,
Regards
There are severe restrictions on the memory addresses that can be specified for Direct I/O with O_DIRECT, and these restrictions are a black box within the Linux Kernel that is difficult to understand.
Sorry, I do not fully understand these restrictions myself.
There is a similar thread on the Xilinx forum, for example, but no one can answer it.
I'm sorry I couldn't be of any help.
Again, thank you very much. The restrictions are really hard to understand and part of a black box, as you said! If I ever find anything useful, I'll let you know. Maybe we can leave this issue open for a while so that others can contribute if they encounter the same problem?
Yes. I agree with your suggestion to leave this issue open and wait for someone else to resolve it.
I just wanted to post this answer which I found on Stack Overflow and seems to be related to this issue: https://stackoverflow.com/a/73032605/2627697
The suggestion there seems to be to avoid setting the VM_IO and VM_PFNMAP flags on the vma, but I'm not really aware of what the downsides of this could be, or whether it is a valid solution at all in this case.
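One way to check from user space whether a mapping carries these flags is to read the `VmFlags` line in `/proc/self/smaps`, where `io` stands for VM_IO and `pf` for VM_PFNMAP. The sketch below is an assumption-laden diagnostic, not part of u-dma-buf: the function name `mapping_io_flags` and the two `MAP_HAS_*` constants are made up for illustration, and it requires a kernel that exposes `VmFlags` in smaps.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAP_HAS_IO 1 /* "io" in VmFlags: VM_IO */
#define MAP_HAS_PF 2 /* "pf" in VmFlags: VM_PFNMAP */

/* Hypothetical helper: finds the mapping that contains `addr` in
 * /proc/self/smaps and returns a bitmask of MAP_HAS_IO / MAP_HAS_PF,
 * or -1 if the mapping or its VmFlags line could not be found. */
int mapping_io_flags(const void *addr)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int in_range = 0;
    int result = -1;

    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;

        /* Mapping header lines start with "start-end" in hex. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
            in_range = (target >= start && target < end);
            continue;
        }
        if (in_range && strncmp(line, "VmFlags:", 8) == 0) {
            result = 0;
            for (char *tok = strtok(line + 8, " \n"); tok; tok = strtok(NULL, " \n")) {
                if (strcmp(tok, "io") == 0)
                    result |= MAP_HAS_IO;
                if (strcmp(tok, "pf") == 0)
                    result |= MAP_HAS_PF;
            }
            break;
        }
    }
    fclose(f);
    return result;
}
```

If the pointer returned by mmap on the u-dma-buf device reports either flag while an ordinary malloc'ed buffer reports neither, that would be consistent with the linked answer, which, as far as I understand it, attributes the "Bad address" error to the kernel refusing to pin pages of such mappings for Direct I/O.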