gulrak/filesystem

Further optimization of directory iterators with POSIX backend

gulrak opened this issue · 2 comments

Is your feature request related to a problem? Please describe.
Further optimization of directory iteration on POSIX backend.

Describe the solution you'd like
Up until v1.5.2, directory iteration issues stat calls even when their results are not needed, e.g. when the iteration only needs the filename and file type. Removing these calls should improve performance in search scenarios and bring it close to that of existing POSIX backend std::filesystem implementations. (It will, however, lead to worse performance in cases where multiple additional attributes such as file size or last write time are needed, where ghc::filesystem was faster.)

Additional context
The standard notes that an implementation should not call refresh (directly or implicitly) on a directory_entry, but should work with the data it gets for free while iterating.

With 1037c02 the work on refining directory iteration on POSIX should be done. Tests iterating over large trees on an SSD now show timings between those of std::filesystem::recursive_directory_iterator on GCC10/libstdc++ and on Clang/libc++. The implemented behavior also conforms more closely to the standard, as it avoids system calls that are not absolutely necessary.

This is now part of release v1.5.4.