metagenome-atlas/atlas

Error in rule combine_gene_coverages

Waschina opened this issue · 4 comments

  • I checked and didn't find a related issue, e.g. while typing the title
  • I got an error in the following rule(s): combine_gene_coverages
  • I checked the log files indicated in the error message (and the cluster logs if submitted to a cluster)

Here is the relevant log output:

localrule combine_gene_coverages:
    input: Genecatalog/alignments/I12311_coverage.tsv, Genecatalog/alignments/I12312_coverage.tsv, Genecatalog/alignments/I12313_coverage.tsv, Genecatalog/alignments/I12314_coverage.tsv, Genecatalog/alignments/I12315_coverage.tsv, Genecatalog/alignments/I12316_coverage.tsv, Genecatalog/alignments/I12317_coverage.tsv, Genecatalog/alignments/I12318_coverage.tsv, Genecatalog/alignments/I12319_coverage.tsv, Genecatalog/alignments/I12320_coverage.tsv, Genecatalog/alignments/I12321_coverage.tsv, Genecatalog/alignments/I12322_coverage.tsv, Genecatalog/alignments/I12323_coverage.tsv, Genecatalog/alignments/I12324_coverage.tsv, Genecatalog/alignments/I12325_coverage.tsv, Genecatalog/alignments/I12326_coverage.tsv, Genecatalog/alignments/I12327_coverage.tsv, Genecatalog/alignments/I12328_coverage.tsv, Genecatalog/alignments/I12329_coverage.tsv, Genecatalog/alignments/I12330_coverage.tsv, Genecatalog/alignments/I12331_coverage.tsv, Genecatalog/alignments/I12332_coverage.tsv, Genecatalog/alignments/I12333_coverage.tsv, Genecatalog/alignments/I12334_coverage.tsv, Genecatalog/alignments/I12335_coverage.tsv, Genecatalog/alignments/I12336_coverage.tsv, Genecatalog/alignments/I12337_coverage.tsv, Genecatalog/alignments/I12338_coverage.tsv, Genecatalog/alignments/I12339_coverage.tsv, Genecatalog/alignments/I12340_coverage.tsv, Genecatalog/alignments/I12341_coverage.tsv, Genecatalog/alignments/I12342_coverage.tsv, Genecatalog/alignments/I12343_coverage.tsv, Genecatalog/alignments/I12344_coverage.tsv, Genecatalog/alignments/I12345_coverage.tsv, Genecatalog/alignments/I12346_coverage.tsv, Genecatalog/alignments/I12347_coverage.tsv, Genecatalog/alignments/I12348_coverage.tsv, Genecatalog/alignments/I12349_coverage.tsv, Genecatalog/alignments/I12350_coverage.tsv, Genecatalog/alignments/I12351_coverage.tsv, Genecatalog/alignments/I12352_coverage.tsv, Genecatalog/alignments/I12353_coverage.tsv, Genecatalog/alignments/I12354_coverage.tsv, Genecatalog/alignments/I12355_coverage.tsv, 
Genecatalog/alignments/I12356_coverage.tsv, Genecatalog/alignments/I12357_coverage.tsv, Genecatalog/alignments/I12358_coverage.tsv, Genecatalog/alignments/I12359_coverage.tsv, Genecatalog/alignments/I12360_coverage.tsv, Genecatalog/alignments/I12361_coverage.tsv, Genecatalog/alignments/I12362_coverage.tsv, Genecatalog/alignments/I12363_coverage.tsv, Genecatalog/alignments/I12364_coverage.tsv, Genecatalog/alignments/I12365_coverage.tsv, Genecatalog/alignments/I12366_coverage.tsv, Genecatalog/alignments/I12367_coverage.tsv, Genecatalog/alignments/I12368_coverage.tsv, Genecatalog/alignments/I12369_coverage.tsv, Genecatalog/alignments/I12370_coverage.tsv, Genecatalog/alignments/I12371_coverage.tsv, Genecatalog/alignments/I12372_coverage.tsv, Genecatalog/alignments/I12373_coverage.tsv, Genecatalog/alignments/I12374_coverage.tsv, Genecatalog/alignments/I12375_coverage.tsv, Genecatalog/alignments/I12376_coverage.tsv, Genecatalog/alignments/I12377_coverage.tsv, Genecatalog/alignments/I12378_coverage.tsv, Genecatalog/alignments/I12379_coverage.tsv, Genecatalog/alignments/I12380_coverage.tsv, Genecatalog/alignments/I12381_coverage.tsv, Genecatalog/alignments/I12382_coverage.tsv, Genecatalog/alignments/I12383_coverage.tsv, Genecatalog/alignments/I12384_coverage.tsv, Genecatalog/alignments/I12385_coverage.tsv, Genecatalog/alignments/I12386_coverage.tsv, Genecatalog/alignments/I12387_coverage.tsv, Genecatalog/alignments/I12388_coverage.tsv, Genecatalog/alignments/I12389_coverage.tsv, Genecatalog/alignments/I12390_coverage.tsv, Genecatalog/alignments/I12391_coverage.tsv, Genecatalog/alignments/I12392_coverage.tsv, Genecatalog/alignments/I12393_coverage.tsv, Genecatalog/alignments/I12394_coverage.tsv, Genecatalog/alignments/I12395_coverage.tsv, Genecatalog/alignments/I12396_coverage.tsv, Genecatalog/alignments/I12397_coverage.tsv, Genecatalog/alignments/I12398_coverage.tsv, Genecatalog/alignments/I12399_coverage.tsv, Genecatalog/alignments/I12400_coverage.tsv, 
Genecatalog/alignments/I12401_coverage.tsv, Genecatalog/alignments/I12402_coverage.tsv, Genecatalog/alignments/I12403_coverage.tsv, Genecatalog/alignments/I12404_coverage.tsv, Genecatalog/alignments/I12405_coverage.tsv, Genecatalog/alignments/I12406_coverage.tsv, Genecatalog/alignments/I12407_coverage.tsv, Genecatalog/alignments/I12408_coverage.tsv, Genecatalog/alignments/I12409_coverage.tsv, Genecatalog/alignments/I12410_coverage.tsv, Genecatalog/alignments/I12411_coverage.tsv, Genecatalog/alignments/I12412_coverage.tsv, Genecatalog/alignments/I12413_coverage.tsv, Genecatalog/alignments/I12414_coverage.tsv, Genecatalog/alignments/I12415_coverage.tsv, Genecatalog/alignments/I12416_coverage.tsv, Genecatalog/alignments/I12417_coverage.tsv, Genecatalog/alignments/I12418_coverage.tsv, Genecatalog/alignments/I12419_coverage.tsv, Genecatalog/alignments/I12420_coverage.tsv, Genecatalog/alignments/I12421_coverage.tsv, Genecatalog/alignments/I12422_coverage.tsv, Genecatalog/alignments/I12423_coverage.tsv, Genecatalog/alignments/I12424_coverage.tsv, Genecatalog/alignments/I12425_coverage.tsv, Genecatalog/alignments/I12426_coverage.tsv, Genecatalog/alignments/I12427_coverage.tsv, Genecatalog/alignments/I12428_coverage.tsv, Genecatalog/alignments/I12429_coverage.tsv, Genecatalog/alignments/I12430_coverage.tsv, Genecatalog/alignments/I12431_coverage.tsv, Genecatalog/alignments/I12432_coverage.tsv, Genecatalog/alignments/I12433_coverage.tsv, Genecatalog/alignments/I12434_coverage.tsv, Genecatalog/alignments/I12435_coverage.tsv, Genecatalog/alignments/I12436_coverage.tsv, Genecatalog/alignments/I12437_coverage.tsv, Genecatalog/alignments/I12438_coverage.tsv, Genecatalog/alignments/I12439_coverage.tsv, Genecatalog/alignments/I12440_coverage.tsv, Genecatalog/alignments/I12441_coverage.tsv, Genecatalog/alignments/I12442_coverage.tsv, Genecatalog/alignments/I12443_coverage.tsv, Genecatalog/alignments/I12444_coverage.tsv, Genecatalog/alignments/I12445_coverage.tsv, 
Genecatalog/alignments/I12446_coverage.tsv, Genecatalog/alignments/I12447_coverage.tsv, Genecatalog/alignments/I12448_coverage.tsv, Genecatalog/alignments/I12449_coverage.tsv, Genecatalog/alignments/I12450_coverage.tsv, Genecatalog/alignments/I12451_coverage.tsv, Genecatalog/alignments/I12452_coverage.tsv, Genecatalog/alignments/I12453_coverage.tsv, Genecatalog/alignments/I12454_coverage.tsv, Genecatalog/alignments/I12455_coverage.tsv, Genecatalog/alignments/I12456_coverage.tsv, Genecatalog/alignments/I12457_coverage.tsv, Genecatalog/alignments/I12458_coverage.tsv, Genecatalog/alignments/I12459_coverage.tsv, Genecatalog/alignments/I12460_coverage.tsv, Genecatalog/alignments/I12461_coverage.tsv, Genecatalog/alignments/I12462_coverage.tsv, Genecatalog/alignments/I12463_coverage.tsv, Genecatalog/alignments/I12464_coverage.tsv, Genecatalog/alignments/I12465_coverage.tsv, Genecatalog/alignments/I12466_coverage.tsv, Genecatalog/alignments/I12467_coverage.tsv, Genecatalog/alignments/I12468_coverage.tsv, Genecatalog/alignments/I12469_coverage.tsv, Genecatalog/alignments/I12470_coverage.tsv, Genecatalog/alignments/I12471_coverage.tsv, Genecatalog/alignments/I12472_coverage.tsv, Genecatalog/alignments/I12473_coverage.tsv, Genecatalog/alignments/I12474_coverage.tsv, Genecatalog/alignments/I12475_coverage.tsv, Genecatalog/alignments/I12476_coverage.tsv, Genecatalog/alignments/I12477_coverage.tsv, Genecatalog/alignments/I12478_coverage.tsv, Genecatalog/alignments/I12479_coverage.tsv, Genecatalog/alignments/I12480_coverage.tsv, Genecatalog/alignments/I12481_coverage.tsv, Genecatalog/alignments/I12482_coverage.tsv, Genecatalog/alignments/I12483_coverage.tsv, Genecatalog/alignments/I12484_coverage.tsv, Genecatalog/alignments/I12485_coverage.tsv, Genecatalog/alignments/I12486_coverage.tsv, Genecatalog/alignments/I12487_coverage.tsv, Genecatalog/alignments/I12488_coverage.tsv, Genecatalog/alignments/I12489_coverage.tsv, Genecatalog/alignments/I12490_coverage.tsv, 
Genecatalog/alignments/I12491_coverage.tsv, Genecatalog/alignments/I12492_coverage.tsv, Genecatalog/alignments/I12493_coverage.tsv, Genecatalog/alignments/I12494_coverage.tsv, Genecatalog/alignments/I12495_coverage.tsv, Genecatalog/alignments/I12496_coverage.tsv, Genecatalog/alignments/I12497_coverage.tsv, Genecatalog/alignments/I12498_coverage.tsv, Genecatalog/alignments/I12499_coverage.tsv, Genecatalog/alignments/I12500_coverage.tsv
    output: Genecatalog/counts/median_coverage.tsv.gz, Genecatalog/counts/Nmapped_reads.tsv.gz
    jobid: 8976
    resources: tmpdir=/tmp, mem=60, time=5, mem_mb=60000, time_min=300

/bin/sh: line 10:  2300 Killed                  /home/suahn360/.conda/envs/sw/envs/atlasenv/bin/python3.8 -m snakemake Genecatalog/counts/median_coverage.tsv.gz --snakefile /home/suahn360/.conda/envs/sw/envs/atlasenv/lib/python3.8/site-packages/atlas/workflow/Snakefile --force --cores 1 --keep-target-files --keep-remote --attempt 4 --scheduler greedy --force-use-threads --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --max-inventory-time 0 --ignore-incomplete --latency-wait 20 --default-resources "tmpdir=system_tmpdir" --directory /work_beegfs/suahn360/2021/atlas_DZHK --configfiles /work_beegfs/suahn360/2021/atlas_DZHK/config.yaml --allowed-rules combine_gene_coverages --notemp --quiet --no-hooks --nolock --mode 1 --use-conda --conda-prefix /work_beegfs/suahn360/2021/atlas_test/databases/conda_envs --conda-base-path /home/suahn360/.conda/envs/sw/envs/atlasenv
Trying to restart job 8976.
[...]

I've set --restart-times to 4, but in all four instances the process is killed.

Atlas version: 2.8.1

Additional context
I ran ATLAS on a cluster system. However, the combine_gene_coverages rule where the error occurs is never submitted as a job to a cluster node, which is why there is no cluster log. I tracked the resources on the head node: the respective process (starting with [...]/atlasenv/bin/python3.8 -m snakemake Genecatalog/counts/median_coverage.tsv.gz [...]) runs for ~15 minutes while its memory usage steadily increases until it reaches the head node's limit (126 GB), which is probably why the process is killed. My guess is that the combined size of the input files (each <sampleID>_coverage.tsv is around 450 MB) is too big, since I'm trying to process 190 samples with ATLAS.
Does anyone have a suggestion as to why this error occurs and/or how to resolve it?

Let me add one more thing: ATLAS is fantastic work - thank you for the development!

Silvio

Sorry for the delayed response over the year-end break.

I probably need to find a more efficient way to merge this table.

But if you want a quick fix on your system, you could install atlas from GitHub, and then I can guide you through some modifications that will run this job on the cluster, where you can have more than 100 GB of RAM. Do you want to do the quick fix?
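
For readers wondering what a more memory-efficient merge could look like, here is a minimal sketch. It is not the ATLAS implementation. It assumes each per-sample file is tab-separated with a header row, that the first column holds the gene ID, and that every file lists the same genes in the same order. The function name `merge_columns_lockstep` and the column name passed as `value_column` are hypothetical. Under those assumptions the files can be read in lockstep, so only one row per sample is in memory at any time.

```python
import csv
import gzip
from contextlib import ExitStack
from itertools import zip_longest


def merge_columns_lockstep(sample_files, value_column, out_path):
    """Merge one column from many per-sample tables into a gzipped wide table.

    sample_files: dict of sample name -> path to a tab-separated file with a
    header row. ASSUMPTION: the first column holds the gene ID and every file
    lists the same IDs in the same order.
    """
    with ExitStack() as stack:
        readers, value_index = [], []
        for path in sample_files.values():
            handle = stack.enter_context(open(path, newline=""))
            reader = csv.reader(handle, delimiter="\t")
            header = next(reader)
            readers.append(reader)
            value_index.append(header.index(value_column))

        out = stack.enter_context(gzip.open(out_path, "wt", newline=""))
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["ID", *sample_files])

        # Advance all files together, so only one row per sample is in memory.
        for rows in zip_longest(*readers):
            if any(row is None for row in rows):
                raise ValueError("input files have different numbers of rows")
            gene_ids = {row[0] for row in rows}
            if len(gene_ids) != 1:
                raise ValueError(f"rows are not aligned: {sorted(gene_ids)}")
            writer.writerow(
                [rows[0][0], *(row[i] for row, i in zip(rows, value_index))]
            )
```

If the files are not aligned row by row, this approach does not apply as written and the function raises an error instead of silently producing a wrong table.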

No worries at all.
Yes, the 'quick fix' sounds great :) I was already looking for where in your scripts I could make the change to submit the task to a cluster node, but couldn't find it for sure. If you could guide me through the required modifications, that would be super helpful.

I have now installed metagenome-atlas' dev version 2.8.1+16.gc78f50e

Thanks so much!

If you remove the

    combine_gene_coverages,

in line 306 of genecatalog.smk, the rule gets submitted to the cluster.

Hi @SilasK

The fix worked, thank you!

Just in case anyone else experiences this issue: I removed the lines

    localrules:
        combine_gene_coverages,

and needed to increase the default memory in the config.yaml:

    mem: 250

The required memory presumably depends on the size of the gene catalogue. In my case, there were 192 samples and ~6.5 million sequences in Genecatalog/gene_catalog.faa, and memory usage peaked at 152 GB.
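
As a rough sanity check on those numbers, the sketch below estimates the size of the final combined tables alone. It assumes (this is not taken from the ATLAS source) that each output table is held as a dense samples × genes array of 8-byte values; the helper name `dense_table_gb` is hypothetical.

```python
def dense_table_gb(n_samples, n_genes, bytes_per_value=8):
    """Size in GB (10^9 bytes) of a dense n_samples x n_genes table."""
    return n_samples * n_genes * bytes_per_value / 1e9


# Figures reported above: 192 samples, ~6.5 million genes.
one_table = dense_table_gb(192, 6_500_000)
print(round(one_table, 2))      # 9.98
# The rule writes two tables (median coverage and mapped-read counts).
print(round(2 * one_table, 2))  # 19.97
```

Under these assumptions the two finished tables would need about 20 GB, so most of the 152 GB peak would come from intermediate data held during the merge rather than from the result itself.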