Excessive CPU and memory use + thread panics
pkolaczk opened this issue · 11 comments
Tried this on my home directory with ~1.5 million files.
/usr/bin/time -v target/release/ddh ~
thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
...
There seems to be no way to turn off symlink following, so I guess it might have escaped from my home directory and tried to scan the whole filesystem. In any case, the CPU and memory usage was extremely high, and the system time is quite unusual for what should be an I/O-bound workload:
^CCommand terminated by signal 2
Command being timed: "target/release/ddh /home/pkolaczk"
User time (seconds): 50.08
System time (seconds): 983.47
Percent of CPU this job got: 776%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:13.10
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 18724728
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 4688422
Voluntary context switches: 27301
Involuntary context switches: 150856
Swaps: 0
File system inputs: 103592
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
ddh consumed 18 GB, while other programs like fdupes, jdupes, rmfind, and fclones need about 150-500 MB.
Interesting. ddh shouldn't be following symlinks at all. That error comes from this bit of code:
let good_entries: Vec<_> = read_dir_results
.filter(|x| x.is_ok())
.map(|x| x.unwrap())
.collect();
let (files, dirs): (Vec<&DirEntry>, Vec<&DirEntry>) = good_entries.par_iter().partition(|&x|
x.path()
.as_path()
.symlink_metadata()
.expect("Error reading Symlink Metadata")
.file_type()
.is_file()
);
Here directory entries are partitioned into files and directories prior to hashing. The "symlink" in the name just mirrors the Rust file API; symlink_metadata does not follow links. I can definitely make a patch to handle that error more gracefully, but would you be willing to test some changes on a branch? I'd like to dig into the root cause if possible.
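Roughly what I have in mind, as an untested sketch rather than the actual ddh code (partition_entries is just an illustrative name): skip any entry whose metadata can't be read, instead of panicking.

use rayon::iter::Either;
use rayon::prelude::*;
use std::fs::DirEntry;

// Sketch: partition entries into files and directories, dropping any entry
// whose metadata can't be read (e.g. it was deleted between read_dir and here).
fn partition_entries(entries: &[DirEntry]) -> (Vec<&DirEntry>, Vec<&DirEntry>) {
    entries
        .par_iter()
        .filter_map(|e| {
            e.path()
                .symlink_metadata()
                .ok() // drop the entry on error instead of expect()-ing
                .map(|meta| (e, meta.file_type().is_file()))
        })
        .partition_map(|(e, is_file)| {
            if is_file {
                Either::Left(e) // regular file
            } else {
                Either::Right(e) // directory or anything else
            }
        })
}

That silently ignores entries that disappear mid-scan, which seems like reasonable behavior for a duplicate finder, though I could also log them.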
The debug-thread-file-panics branch might resolve your issue due to a change in how metadata is accessed. If you can get a reproducible panic, could you share what type of filesystem entry it is?
Panics are no longer happening on the debug-thread-file-panics branch, but the excessive memory use is still present: it consumes tens of gigabytes very quickly, before printing anything. Is there a progress bar, so I can at least tell how far it has gotten?
Ah, I was wrong: there was a panic, but I needed to wait longer:
thread '<unnamed>' panicked at 'Error reading path length: Os { code: 40, kind: Other, message: "Too many levels of symbolic links" }', src/lib.rs:270:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
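My guess is that there is a symlink loop somewhere under my home directory. A minimal, hypothetical reproduction (these paths are made up, not from my actual tree) is two links pointing at each other; fs::metadata follows links and fails with ELOOP (os error 40), while fs::symlink_metadata does not:

use std::fs;
use std::os::unix::fs::symlink;

fn main() -> std::io::Result<()> {
    // Two symlinks pointing at each other form a loop.
    symlink("loop_b", "loop_a")?;
    symlink("loop_a", "loop_b")?;

    // symlink_metadata() does not follow the link, so it succeeds.
    println!("{:?}", fs::symlink_metadata("loop_a")?.file_type());

    // metadata() follows the link and fails with
    // "Too many levels of symbolic links" (os error 40).
    println!("{:?}", fs::metadata("loop_a"));

    Ok(())
}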
I was able to reproduce the error locally and made a new commit on debug-thread-file-panics which resolves the bug for me. Would you mind giving it another spin?
I've been busy; I'll try to look at this over the weekend.
I've merged the debug-thread-file-panics branch into master as it corrects a bug. When you have a chance, please let me know if that resolves your bug or if there is yet another :)
Closing. I believe the issue is resolved and the reporter has not followed up.