Suggestion: Avoid using static 8K(i?) block sizes.
Instead, you should use the stat family of functions (stat(2) and statfs(2) in particular) to obtain the recommended block size for the file and file system being targeted, and use that. An old disk may use a sector size of 512 bytes. A CD or DVD typically uses 2048-byte (2 KiB) sectors. A USB "thumb drive" or external hard drive may use even larger sizes (I have one that uses 16 KiB sectors).
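For illustration, here is a minimal sketch of querying those values, assuming a POSIX system and Python. The function name `recommended_block_sizes` is made up for this example. Python exposes `os.statvfs` (the statvfs(3) interface) rather than statfs(2) directly, so that is what the sketch uses.

```python
import os

def recommended_block_sizes(path):
    """Return the block sizes the kernel reports for a path (POSIX only)."""
    st = os.stat(path)       # per-file information, as from stat(2)
    vfs = os.statvfs(path)   # per-file-system information, as from statvfs(3)
    return {
        "st_blksize": st.st_blksize,  # preferred I/O block size for this file
        "f_bsize": vfs.f_bsize,       # file system block size
        "f_frsize": vfs.f_frsize,     # fragment size (fundamental block size)
    }

if __name__ == "__main__":
    for name, value in recommended_block_sizes(".").items():
        print(f"{name}: {value} bytes")
```

The reported values depend on the device and file system, so the output will differ from machine to machine.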
It is reading/writing 8MB at a time, not 8kB.
My mistake; I misread it on my small screen! You could still scale the value returned by the stat call in various ways. About a decade ago I wrote a small tool that calculated multiple file hashes and checksums, and it scaled its buffer to about 25% of memory (using a bit shift, MEMSIZE>>2, to be slightly faster than explicit division). The first version did not use the block size from the stat call. The second did, and it ran about 3x faster, as the kernel was able to schedule whole block/sector transfers rather than rewriting a single block twice: once for the first half and again for the second half. (Modern kernels may handle this much better than they did a decade ago, however.)
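A rough sketch of that kind of scaling, not the original tool: it takes a quarter of physical memory with a right shift by 2, rounds down to a whole number of blocks, and feeds each chunk to several hashes. The names `scaled_buffer_size` and `hash_file`, the 8 MiB cap, the 4096-byte fallback, and the choice of hash algorithms are all assumptions made for this example. The `os.sysconf` names used to read the memory size are assumed to be available, as they are on Linux.

```python
import hashlib
import os

def scaled_buffer_size(block_size, mem_bytes, cap=8 * 1024 * 1024):
    """Largest multiple of block_size within a quarter of mem_bytes.

    The cap is an assumption of this sketch, not part of the original tool.
    At least one block is always returned.
    """
    budget = min(mem_bytes >> 2, cap)      # >> 2 divides by four
    blocks = max(budget // block_size, 1)  # whole blocks only
    return blocks * block_size

def hash_file(path, algorithms=("sha256", "sha512")):
    """Compute several digests of one file in a single pass."""
    block_size = os.stat(path).st_blksize or 4096  # fallback is an assumption
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    buf_size = scaled_buffer_size(block_size, mem_bytes)
    hashers = [hashlib.new(name) for name in algorithms]
    with open(path, "rb") as f:
        while chunk := f.read(buf_size):
            for h in hashers:
                h.update(chunk)
    return {h.name: h.hexdigest() for h in hashers}
```

Keeping the buffer a whole multiple of the reported block size is the point of the exercise: every read then covers complete blocks. Whether this still yields a measurable speedup on a current kernel would need to be benchmarked.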