njaupan/ecc_finder

Does it need too much memory, both RAM and disk space?

Opened this issue · 2 comments

I read your code for the map-sr mode and found that you use pybedtools and pandas to process the bam2bed file and find split and discordant reads in it. With very large data, this may cause a crash!

My BED file is up to 21 GB, and my / directory only has 21 GB left. When I ran the pipeline, RAM usage went above 80 GB, and the temporary files grew so large that the pipeline was shut down!

Can you give me any advice on how to solve this problem?

Hi,
The problem comes from "my / directory only has 21 GB left", so you need to enlarge your storage.
Neither pybedtools nor pandas needs much RAM; they just take longer for a larger genome.
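
If enlarging / is not an option, one possible workaround is to send temporary files to a larger volume. The sketch below uses only the Python standard library. The helper names `pick_temp_dir` and `use_temp_dir` are hypothetical, and it assumes the pipeline's temporary files go through Python's default temporary directory, which has not been checked against the ecc_finder code.

```python
import os
import shutil
import tempfile


def pick_temp_dir(candidates, needed_bytes):
    """Return the first existing directory with at least needed_bytes free, else None."""
    for path in candidates:
        if os.path.isdir(path) and shutil.disk_usage(path).free >= needed_bytes:
            return path
    return None


def use_temp_dir(path):
    """Make `path` the default temp directory for this process and its children."""
    # Child processes that honor TMPDIR inherit it from the environment.
    os.environ["TMPDIR"] = path
    # Python's tempfile module uses this value for later temp files.
    tempfile.tempdir = path
```

Calling `use_temp_dir` before the pipeline starts only helps for tools that respect `TMPDIR` or Python's `tempfile` default.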

May I ask whether your raw data is sequencing data after enrichment for eccDNA? The BAM file (20G as you mentioned) has very high coverage, and it looks like whole-genome sequencing data.
Best

It is very difficult to change the size of the / mount. I found that you use bedtools sort to sort the BED files. I suggest using GNU sort instead. That way, memory use and temporary files will be much smaller, and the temporary file path can be customized.
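
A minimal sketch of that suggestion, calling GNU sort from Python. The function name `sort_bed` and its parameters are hypothetical and this is not how ecc_finder currently sorts. It assumes a tab-delimited BED file sorted by chromosome and then numerically by start position, with `-S` capping the in-memory buffer and `-T` choosing where sort writes its temporary files.

```python
import os
import subprocess


def sort_bed(in_path, out_path, tmp_dir, buffer_size="1G"):
    """Sort a BED file by chromosome, then by numeric start, using the sort command."""
    # LC_ALL=C gives plain byte-order comparison of chromosome names.
    env = dict(os.environ, LC_ALL="C")
    cmd = [
        "sort",
        "-t", "\t",         # BED columns are tab-separated
        "-k1,1",            # first key: chromosome name
        "-k2,2n",           # second key: start position, numeric
        "-S", buffer_size,  # upper bound on the in-memory buffer
        "-T", tmp_dir,      # directory for sort's temporary files
        "-o", out_path,
        in_path,
    ]
    subprocess.run(cmd, check=True, env=env)
```

For example, `sort_bed("reads.bed", "reads.sorted.bed", "/data/tmp", "4G")` would keep the temporary files under `/data/tmp` instead of /; those paths are placeholders.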