pinellolab/dictys

Suggestions to reduce storage?

Closed this issue · 3 comments

image

When I run this software, the step shown above consumes a significant amount of disk storage.
Are there any parameters that can reduce the size of the generated files?

Thanks !!!

Hi bitcometz. Thank you for letting us know about this problem. Unfortunately, the files are needed. One option is to use a custom Google Compute instance for Colab. I don't know whether Colab Pro still offers larger disks. Alternatively, is it possible to run the tutorial in a conventional non-Colab setting?

Hi, thanks for your quick reply!
Could dictys offer a parameter such as a chromosome list?
For example, could I process one chromosome at a time to save disk storage?
Or should I first split the BAM file into one file per chromosome?

Thanks for developing this great tool!
However, its high resource requirements limit its usage, especially when dealing with large amounts of cellular data or multiple samples.

Actually, a 200 GB disk does not seem like a high resource requirement to me, even for a personal laptop. Unfortunately, bioinformatics is a wide field, and we cannot optimize every aspect of it with a single piece of software. If you are willing to contribute a more efficient script to split the BAM files and post it on GitHub, we would be happy to refer people with large datasets to it.
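
For anyone who wants to try the per-chromosome split asked about earlier in this thread, here is a minimal sketch. It is not part of dictys, and nothing in this thread confirms that the dictys pipeline can consume per-chromosome BAM files. It assumes `samtools` is on the `PATH` and that the input BAM is coordinate-sorted and indexed. The helper names (`parse_contigs`, `split_cmd`, `split_bam_by_contig`) and the output naming scheme are hypothetical.

```python
import subprocess
from pathlib import Path


def parse_contigs(idxstats_text):
    """Extract contig names from `samtools idxstats` output.

    Each line is tab-separated: name, length, mapped reads, unmapped reads.
    The final '*' row (unplaced reads) is skipped.
    """
    names = []
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if fields and fields[0] and fields[0] != "*":
            names.append(fields[0])
    return names


def split_cmd(bam, contig, out_dir):
    """Build the samtools command that writes one contig's reads to its own BAM."""
    # Hypothetical naming scheme: <input stem>.<contig>.bam
    out = Path(out_dir) / f"{Path(bam).stem}.{contig}.bam"
    return ["samtools", "view", "-b", "-o", str(out), str(bam), contig]


def split_bam_by_contig(bam, out_dir):
    """Write one BAM per contig into out_dir (requires an indexed input BAM)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    stats = subprocess.run(
        ["samtools", "idxstats", str(bam)],
        check=True, capture_output=True, text=True,
    ).stdout
    for contig in parse_contigs(stats):
        subprocess.run(split_cmd(bam, contig, out_dir), check=True)
```

Calling `split_bam_by_contig("sample.bam", "by_chr")` would write all per-chromosome files, whose combined size is roughly that of the input. To actually lower peak disk usage, each per-chromosome file would have to be processed and deleted before the next one is created, for example by running `split_cmd` for a single contig at a time.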