DataDog/glommio

`append` changes `write_at` in a weird way that's not well-documented

vlovich opened this issue · 1 comments

The test below will fail if you uncomment the "append" line. The reason is that the position is ignored for all writes. It may be nice to put a clearer warning in the docs for append that it changes write_to in this way + a callout in write_to that pos is ignored when the file is opened append.

Overall, I'm kind of curious what the use-case append solves is. Is it to avoid some kind of TOCTTOU issue around vs just reading the size of the file and using that as the EOF position?

    #[test]
    fn glommio_write_with_gaps() {
        test_executor!(async move {
            use tempfile::Builder;

            let temp_dir = Builder::new()
                .prefix("glommio_write_with_gaps")
                .tempdir()
                .unwrap();
            let path = PathBuf::from(temp_dir.path()).join("my_test.bin");

            let dma_file = glommio::io::OpenOptions::new()
                .create(true)
                .read(true)
                // .append(true)
                .write(true)
                .dma_open(path)
                .await
                .unwrap();
            let write_length = dma_file.align_up(1) as usize;
            let mut buffer = dma_file.alloc_dma_buffer(write_length);
            buffer.as_bytes_mut()[0] = 5;
            buffer.as_bytes_mut()[1..].fill(0);
            dma_file.write_at(buffer, 0).await.unwrap();
            assert_eq!(dma_file.file_size().await.unwrap(), write_length as u64);

            let mut buffer = dma_file.alloc_dma_buffer(write_length);
            buffer.as_bytes_mut()[0] = 6;
            buffer.as_bytes_mut()[1..].fill(0);
            dma_file.write_at(buffer, 2 * 1024 * 1024).await.unwrap();
            assert_eq!(dma_file.file_size().await.unwrap(), 2 * 1024 * 1024 + write_length as u64);

            let buffer = dma_file.read_at_aligned(0, write_length).await.unwrap();
            assert_eq!(buffer.len(), write_length);
            assert_eq!(buffer[0], 5);
            assert_eq!(buffer[1], 0);

            let buffer = dma_file.read_at_aligned(2 * 1024 * 1024, write_length).await.unwrap();
            assert_eq!(buffer.len(), write_length);
            assert_eq!(buffer[0], 6);
            assert_eq!(buffer[1], 0);
        });

This is very surprising to me.

but in a nutshell, just shows how complex and full of edge cases things are =)

I checked the man page, which says this:

O_APPEND
              The file is opened in append mode.  Before each [write(2)](https://man7.org/linux/man-pages/man2/write.2.html),
              the file offset is positioned at the end of the file, as
              if with [lseek(2)](https://man7.org/linux/man-pages/man2/lseek.2.html).  The modification of the file offset and
              the write operation are performed as a single atomic step.

              O_APPEND may lead to corrupted files on NFS filesystems if
              more than one process appends data to a file at once.
              This is because NFS does not support appending to a file,
              so the client kernel has to simulate it, which can't be
              done without a race condition.

I ignored the paragraph that talks about NFS, because everything leads to data corruption on NFS, the only thing that doesn't, is not using NFS.

The interesting bits are:

Before EACH write.

So the behavior your describe makes sense. Although this is the Linux standard behavior, this is surprising enough that I agree that a comment would be welcome.