/mapped_fileio

Primary language: Rust. License: GNU General Public License v3.0 (GPL-3.0).

Memory-mapped File IO

This library reads files by using mmap() under the hood. Mapping a file into memory lets you read it as if it were a plain const char* array (or &[u8] in Rust terms).

Documentation is hosted here.

What is mmap()?

mmap() is a POSIX system call that maps files and devices into (virtual) memory.

How does this library work?

When a file is opened, the following operations are executed:

  1. open() is called to get a file descriptor.
  2. fstat() is called to get the file size. This is needed to tell mmap() how much memory to map. The size is also used for boundary checking.
  3. Using mmap() the file is mapped to memory.
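
The three steps above can be sketched roughly as follows. This is not the library's code: the helper name with_mapped is hypothetical, std's File::open() and File::metadata() stand in for the raw open() and fstat() calls, and mmap()/munmap() are declared by hand instead of going through nix so that the snippet needs no external crate. It assumes a 64-bit Linux or macOS target.

```rust
use std::ffi::c_void;
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Hand-written bindings (assumption: 64-bit Linux or macOS, where off_t is i64).
// On the Rust 2024 edition this block has to be written `unsafe extern "C"`.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

const PROT_READ: i32 = 0x1; // same value on Linux and macOS
const MAP_PRIVATE: i32 = 0x2; // same value on Linux and macOS

/// Hypothetical helper: maps `path` read-only, hands the bytes to `f`,
/// then unmaps the file again.
fn with_mapped<R>(path: &Path, f: impl FnOnce(&[u8]) -> R) -> io::Result<R> {
    // Step 1: open the file to obtain a file descriptor.
    let file = File::open(path)?;

    // Step 2: query the file size; mmap() needs to know how much to map.
    let len = file.metadata()?.len() as usize;
    if len == 0 {
        // mmap() rejects zero-length mappings.
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "cannot map an empty file"));
    }

    // Step 3: map the whole file into memory.
    let ptr = unsafe { mmap(std::ptr::null_mut(), len, PROT_READ, MAP_PRIVATE, file.as_raw_fd(), 0) };
    if ptr as isize == -1 {
        // MAP_FAILED is (void*)-1; errno holds the reason.
        return Err(io::Error::last_os_error());
    }

    // The mapping is valid for `len` bytes until munmap() is called below.
    let bytes = unsafe { std::slice::from_raw_parts(ptr as *const u8, len) };
    let result = f(bytes);
    unsafe { munmap(ptr, len) };
    Ok(result)
}
```

Passing the bytes to a closure keeps the slice from outliving the mapping, which is one simple way to make the borrow sound in a sketch like this.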

The MappedFile structure will keep track of the current seek position. It also implements the following traits:

Dependencies

  • nix
    • Provides friendly bindings to various *nix platform APIs
    • Only the necessary features are enabled.

⚠️ Limitations

  1. This library only works on *nix-based systems (Linux, macOS). It does NOT support Windows.
  2. There's no write support yet.
    • Implementing this is simple, as long as the operation would not modify the file size.

Does this library use unsafe operations?

Yes. Unfortunately, there's no way around this.

  • From Rust's perspective, system calls are "unsafe" because they are external C functions.
  • Raw pointers are used.

Will large files take up a lot of RAM?

This depends heavily on how the OS decides to page the file in. The kernel manages such mappings automatically, and you have little direct control over them.

Either way, most operating systems are smart enough not to load a huge file into RAM up front. mmap() reserves virtual address space, and the kernel typically brings pages into physical RAM only when they are actually accessed.

To put it simply, you don't really need to worry.

Notes on seeking

The documentation of the seek() function says that seeking beyond the end of the file shall be allowed. However, since such an operation would very likely cause a segfault with a memory mapping, this library does not permit it. If a seek is attempted with an invalid offset, an error is returned.
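
A bounds check of this kind might look like the sketch below. The function name checked_seek and the choice of ErrorKind::InvalidInput are illustrative assumptions, not the library's actual API. The sketch also assumes that seeking to exactly the end of the file is still valid.

```rust
use std::io::{self, SeekFrom};

/// Hypothetical helper: resolves a seek request for a mapping of `len` bytes
/// whose cursor is currently at `pos`. Any target outside 0..=len is an error.
fn checked_seek(pos: u64, len: u64, target: SeekFrom) -> io::Result<u64> {
    // Compute in i128 so that neither u64 nor i64 inputs can overflow.
    let new_pos: i128 = match target {
        SeekFrom::Start(offset) => offset as i128,
        SeekFrom::End(delta) => len as i128 + delta as i128,
        SeekFrom::Current(delta) => pos as i128 + delta as i128,
    };

    if new_pos < 0 || new_pos > len as i128 {
        // Assumption: out-of-range seeks are reported as InvalidInput.
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "seek offset out of bounds"));
    }
    Ok(new_pos as u64)
}
```

For a 10-byte file, SeekFrom::End(-1) resolves to position 9, while SeekFrom::Start(11) and SeekFrom::Current(-1) from position 0 are both rejected.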

Closing files

When a MappedFile goes out of scope, the memory is automatically unmapped using munmap() and the file descriptor is closed.
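
In Rust this kind of cleanup is usually expressed with the Drop trait. The sketch below uses a hypothetical Mapping struct as a stand-in for MappedFile. Its fields, its open() and as_bytes() methods, and the hand-written mmap()/munmap() bindings are assumptions made so that the snippet is self-contained (64-bit Linux or macOS assumed); the library itself goes through nix.

```rust
use std::ffi::c_void;
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Hand-written bindings (assumption: 64-bit Linux or macOS, where off_t is i64).
// On the Rust 2024 edition this block has to be written `unsafe extern "C"`.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

const PROT_READ: i32 = 0x1; // same value on Linux and macOS
const MAP_PRIVATE: i32 = 0x2; // same value on Linux and macOS

/// Hypothetical stand-in for the library's MappedFile.
struct Mapping {
    ptr: *mut c_void,
    len: usize,
    _file: File, // owns the file descriptor; closed when this field is dropped
}

impl Mapping {
    fn open(path: &Path) -> io::Result<Mapping> {
        let file = File::open(path)?;
        let len = file.metadata()?.len() as usize;
        if len == 0 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "cannot map an empty file"));
        }
        let ptr = unsafe { mmap(std::ptr::null_mut(), len, PROT_READ, MAP_PRIVATE, file.as_raw_fd(), 0) };
        if ptr as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        Ok(Mapping { ptr, len, _file: file })
    }

    fn as_bytes(&self) -> &[u8] {
        // Valid because the mapping lives at least as long as `self`.
        unsafe { std::slice::from_raw_parts(self.ptr as *const u8, self.len) }
    }
}

impl Drop for Mapping {
    fn drop(&mut self) {
        // drop() cannot return an error, so a munmap() failure is ignored here.
        unsafe { munmap(self.ptr, self.len) };
        // After this body runs, `_file` is dropped and the descriptor is closed.
    }
}
```

Because Rust drops a struct's fields only after its Drop::drop body has run, the memory is unmapped first and the file descriptor is closed second.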