sambamba markdup too many open files issue
Closed this issue · 7 comments
sambamba-markdup: Cannot open or create file '/foo/bar/sambamba-pid13785-hdac/sorted.86.bam' : Too many open files
Any ideas how to overcome this issue?
sambamba --help
sambamba v0.5.5
Usage: sambamba [command] [args...]
Available commands: 'view', 'index', 'merge', 'sort',
'flagstat', 'slice', 'markdup', 'depth', 'mpileup'
To get help on a particular command, just call it without args.
Leave bug reports and feature requests at
https://github.com/lomereiter/sambamba/issues
Try --overflow-list-size 600000.
Agreed, enlarging --overflow-list-size also worked for me. Alternatively, you can look into your system's open-file limits, for example in /etc/security/limits.conf.
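For reference, here is a minimal sketch of inspecting and raising the per-process open-file limit from a wrapper script, using Python's standard `resource` module (Unix only). The helper name `raise_open_file_limit` and the example value 4096 are illustrative assumptions, not anything sambamba provides. The soft limit can only be raised up to the hard limit without extra privileges; the hard limit itself is what entries in /etc/security/limits.conf control.

```python
import resource


def raise_open_file_limit(wanted):
    """Try to raise the soft RLIMIT_NOFILE to `wanted` (hypothetical helper).

    Returns (old_soft, new_soft). The soft limit is never lowered and
    never set above the hard limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # Already unlimited, nothing to do.
        return soft, soft
    # Cap the request at the hard limit unless the hard limit is unlimited.
    new_soft = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        return soft, new_soft
    return soft, soft


if __name__ == "__main__":
    # 4096 is an arbitrary example value.
    old, new = raise_open_file_limit(4096)
    print("soft open-file limit:", old, "->", new)
```

Resource limits are inherited by child processes, so a wrapper that raises the limit before launching sambamba would pass the new limit on to it.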
A simple idea would be to merge intermediate files once there are too many of them. That's on my to-do list.
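To illustrate that idea, here is a rough sketch of merging intermediate files in bounded batches so that only a limited number are open at once. It is not sambamba's implementation: it works on plain sorted text files rather than BAM, and the names `merge_batch`, `compact_runs`, and `max_open` are made up for this example.

```python
import heapq
import os
import tempfile
from contextlib import ExitStack


def merge_batch(paths, out_dir):
    """Merge sorted text files into one new sorted file and return its path."""
    fd, out_path = tempfile.mkstemp(dir=out_dir, suffix=".merged")
    with ExitStack() as stack:
        out = stack.enter_context(os.fdopen(fd, "w"))
        inputs = [stack.enter_context(open(p)) for p in paths]
        # heapq.merge streams the inputs lazily, one line per file at a time.
        out.writelines(heapq.merge(*inputs))
    return out_path


def compact_runs(paths, max_open, out_dir):
    """Merge groups of at most `max_open` files until at most `max_open` remain.

    At any moment, at most max_open inputs plus one output are open.
    """
    if max_open < 2:
        raise ValueError("max_open must be at least 2")
    paths = list(paths)
    while len(paths) > max_open:
        batch, paths = paths[:max_open], paths[max_open:]
        merged = merge_batch(batch, out_dir)
        for p in batch:
            os.remove(p)  # the merged file replaces its inputs
        paths.append(merged)
    return paths
```

Each pass replaces `max_open` files with one, so the loop always terminates, at the cost of rewriting some data more than once.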
I still got the "too many open files" error after using "--overflow-list-size 600000" when running a large batch of samples.
Can the value 600000 be increased? What is the upper threshold? Are there any limits tied to this value, e.g. memory size?
@wuyilei: as stated in the linked nextflow issue, setting ulimit to unlimited should have solved the problem, so I consider this strange as well.
There is no upper threshold, and another suggestion is to increase --hash-table-size as well. This naturally increases memory consumption linearly. (And if you run into issues with memory, please check out the markdup-extsort branch and test whether it works for you.)
Isn't popFrontOverflowList leaking files when a paired read is found in the overflow list? In that case closeTmpWriter is not called. I'm guessing the std.stdio.File for the index will be automatically cleaned up, but I don't think the BamWriter will be? Does GC get that?
Potentially also readsFromTempFiles - not clear to me what can be counted on from the runtime.