HPC Server Issues
Hi Mantas, I really appreciate the work that has gone into the mmlong2 pipeline. This past spring I had it working until my university HPC server went down, and reinstalling mmlong2 is causing some issues. My installation code is below:
mamba create -n mmlong2 -c conda-forge -c bioconda snakemake=7.26.0 singularity=3.8.6 zenodo_get=1.3.4 pv=1.6.6 pigz=2.6 tar=1.34 -y
followed by
mamba activate mmlong2 || source activate mmlong2 && zenodo_get -r 8027235 -o mmlong2/bin
and just as in the readme
pv mmlong2/bin/sing-mmlong2-lite-*.tar.gz | pigz -dc - | tar xf - -C mmlong2/bin/.
pv mmlong2/bin/sing-mmlong2-proc-*.tar.gz | pigz -dc - | tar xf - -C mmlong2/bin/.
chmod +x mmlong2/bin/mmlong2
I then try to run mmlong2 -h
and get the error message:
grep: /mnt/home/marshaag/mambaforge/envs/mmlong2/bin/mmlong2-proc-config.yaml: No such file or directory
This led me to look in my mambaforge bin directory, which doesn't contain the lite or proc .yaml files or the lite or proc .smk files. I tried cloning the repository and copying those files into my mambaforge mmlong2 bin; the pipeline then started but failed at random steps that previously had no issues with the same input and arguments. I believe the inconsistent failures are just due to my broken installation, but you probably know a bit more about those intricacies than I do.
I'm pretty new to snakemake pipelines so any help getting this working again would be greatly appreciated!
Hey Austin, thank you for your interest in the workflow.
From the description, I would assume that the zenodo_get -r 8027235 -o mmlong2/bin command could not download all the necessary files and a clean re-install would probably fix this.
Could you also post the error messages? It would help a lot with the troubleshooting.
Hi Mantas, sorry for getting back kinda late.
The error from the script is quite long so I included most in the text file along with a snakemake log of the same error on a different run:
2023-06-21T110842.028534.snakemake.log
I believe it is just the zenodo_get command not working correctly, because I do remember it running but grabbing 0 MB for both the .yaml and .smk files. I'll attempt to upload those missing Zenodo files to my HPC and see if that fixes the issue.
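A quick way to confirm an incomplete download is to look for zero-byte files and count the workflow files that actually arrived. This is a minimal sketch: the function name check_download is hypothetical, and the *.yaml and *.smk patterns are assumptions based on the file types named in this thread, not a list of what the Zenodo record contains.

```shell
# List zero-byte files in a directory and count the non-empty workflow files.
# Assumption: the configs end in .yaml and the Snakefiles end in .smk.
check_download() {
    local dir="$1" empty n_yaml n_smk
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    # Files that were created but never filled (a 0-byte download).
    empty=$(find "$dir" -maxdepth 1 -type f -empty | sort)
    # Non-empty config and Snakefile entries; $(( )) strips padding from wc.
    n_yaml=$(find "$dir" -maxdepth 1 -type f -name '*.yaml' ! -empty | wc -l)
    n_smk=$(find "$dir" -maxdepth 1 -type f -name '*.smk' ! -empty | wc -l)
    n_yaml=$((n_yaml))
    n_smk=$((n_smk))
    echo "non-empty yaml: $n_yaml, non-empty smk: $n_smk"
    if [ -n "$empty" ]; then
        echo "zero-byte files:"
        echo "$empty"
        return 1
    fi
    # Succeed only if at least one file of each kind is present.
    [ "$n_yaml" -gt 0 ] && [ "$n_smk" -gt 0 ]
}
```

Running check_download mmlong2/bin after the zenodo_get step would return a non-zero status if anything came down empty or if either file type is missing entirely.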
Hey again, sorry for the late reply.
Did the re-installation help?
I've gone through the error logs and, unfortunately, it's not something I have encountered before.
One common cause for random workflow crashes is having a temporary directory with a capped file size or count limit.
Creating your own directory specifically for the temporary files and providing it to the workflow with the -tmp option usually fixes this.
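Setting up such a directory could look like the sketch below. The function names and the mmlong2_tmp prefix are hypothetical, and the location is whatever storage on the cluster has no size or file-count cap; only the -tmp option itself comes from this thread.

```shell
# Create a uniquely named temporary directory under a chosen root and print it.
make_tmp_dir() {
    local root="$1" dir
    mkdir -p "$root" || return 1
    # mktemp -d creates the directory with owner-only permissions.
    dir=$(mktemp -d "$root/mmlong2_tmp.XXXXXX") || return 1
    echo "$dir"
}

# Show free space and free inodes for the filesystem holding a directory.
# A nearly exhausted inode count is one form of a file-count limit.
report_limits() {
    local dir="$1"
    df -h "$dir"
    df -i "$dir"
}
```

For example, tmp=$(make_tmp_dir "$HOME/scratch") followed by report_limits "$tmp" prints the limits of that filesystem, and "$tmp" is then the path to pass to -tmp alongside the usual arguments. Per-user quotas are not shown by df, so a cluster-specific quota command may still be needed.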
Hi Mantas,
I did the re-install and started using the -tmp option, but I'm still running into the same error around the medaka polishing step of the flye assembly. I think it has to do with the OSError: [Errno 39] Directory not empty: 'variables' line of the error output, and was wondering if you knew anything about this folder being created in the conda env by the pipeline itself?
Sadly, I have not experienced this error.
My best guess is that it is either something specific to your university's HPC setup or you are using the working/temporary directories without having full user permissions to them.
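The permissions part of that guess can be tested directly. The sketch below uses a hypothetical function, check_dir_access, that goes beyond the permission bits by creating and then deleting a nested directory, since a cleanup that leaves files behind would be consistent with a "Directory not empty" error. It is a diagnostic under those assumptions, not something the workflow itself provides.

```shell
# Check that the current user can fully use a directory: list it, write to it,
# and create and then remove a nested directory inside it.
check_dir_access() {
    local dir="$1" probe
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
        return 1
    fi
    if ! { [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; }; then
        echo "missing read, write or search permission: $dir"
        return 1
    fi
    # Permission bits can look fine while quotas or ACLs still block writes,
    # so actually create and delete something.
    probe=$(mktemp -d "$dir/access_probe.XXXXXX") || {
        echo "cannot create a subdirectory in: $dir"
        return 1
    }
    : > "$probe/file" || {
        echo "cannot write a file in: $probe"
        return 1
    }
    rm -rf "$probe"
    if [ -e "$probe" ]; then
        echo "cleanup left files behind in: $probe"
        return 1
    fi
    echo "ok: $dir"
}
```

Calling it on both the working directory and the directory given to -tmp, from a compute node rather than the login node, shows whether the job environment sees the same permissions as an interactive shell.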
We have had some hiccups with our HPC recently, so it most probably is specific to our own server. I have my admin working on it currently and really appreciate all the help and time you've given to addressing this issue! Closing the issue now, and if you guys are coming to the NCM in Houston this December, I would love to meet up.
Hi Mantas,
Sorry to open this back up again, but after some more digging I came upon this issue in the medaka GitHub, which describes the same error I am running into and which keeps medaka consensus from completing. It's not a huge issue for medaka by itself, since they were able to re-execute the command and let it finish, but the mmlong2 pipeline cleans up the directory when resuming this step, so it never finishes. I watched the pipeline to see exactly where it fails, and it never reaches the medaka stitch command shown at the bottom of the snakemake log above. I also went through and completed the install exactly how it's described (not with the mamba implementation like above), and the same issue still persists.
I'm almost positive it's got something to do with medaka, but in the log file above I changed the line if [ Nanopore-simplex == "PacBio-HiFi" ]; then cp pat_water_3_R10_BC01/tmp/flye/assembly.fasta pat_water_3_R10_BC01/tmp/polishing/asm_pol.fasta; else so that it reads Nanopore-simplex == "Nanopore-simplex", which allowed the script to proceed past this error; it then failed at the binning_prep1 step later on. I know this is not correct since my inputs are nanopore reads, but I just wanted to see where this would take me.
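The effect of that edit can be read from a small sketch of the branch. It assumes the workflow substitutes the sequencing mode into the command before it runs, which is how the literal Nanopore-simplex ends up in the log. The function name and the echoed messages are hypothetical, and the else branch is cut off in the quoted line, so "run polishing" is an assumption about what follows it.

```shell
# Mirror the test from the log: PacBio-HiFi copies the Flye assembly unchanged,
# any other mode falls through to the else branch.
# "=" is the portable spelling of the "==" seen in the log.
polish_branch() {
    local mode="$1"
    if [ "$mode" = "PacBio-HiFi" ]; then
        echo "copy assembly.fasta to asm_pol.fasta (polishing skipped)"
    else
        echo "run polishing"
    fi
}
```

With the original test, polish_branch "Nanopore-simplex" takes the else branch. Rewriting the comparison so it is always true forces the copy branch instead, so everything downstream receives the unpolished Flye assembly.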
I would really appreciate some help in getting this running as I'm depending on it pretty heavily to finish up my PhD.
Thanks,
Austin
Hey again Austin,
Thanks for the follow up!
I've read through the GitHub comments and, if I understand it correctly, the issue is with a dependency for Medaka that fails to clean up the temporary files. This would explain why I am unable to reproduce the problem, as I just use the main server storage for the temp files.
It's odd that even with the -tmp option you still get this problem. Maybe the temporary file location is somehow set differently on your server?
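One way to see where temporary files are likely to land is to reproduce the lookup order of Python's tempfile module: TMPDIR, TEMP, TMP, then /tmp, /var/tmp and /usr/tmp, and finally the current directory. The sketch below approximates that order with a hypothetical function, effective_tmp. Whether medaka's dependency uses tempfile, and whether -tmp changes any of these variables inside the container, are both assumptions that this thread does not confirm.

```shell
# Print the first usable temporary directory in the order Python's tempfile
# module documents. This approximates the real check, which creates a test
# file rather than inspecting permissions.
effective_tmp() {
    local cand
    for cand in "${TMPDIR:-}" "${TEMP:-}" "${TMP:-}" /tmp /var/tmp /usr/tmp; do
        if [ -n "$cand" ] && [ -d "$cand" ] && [ -w "$cand" ]; then
            echo "$cand"
            return 0
        fi
    done
    # Last resort, as in tempfile: the current working directory.
    echo "$PWD"
}
```

Running effective_tmp inside a job script, and again inside the container if that is possible, shows whether the two environments agree. If the result is a node-local /tmp rather than the directory given to -tmp, exporting TMPDIR before launching the workflow is one thing to try.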
Anyway, if you really need to get your MAGs fast, there are several alternatives:
- If your Nanopore data was generated in 4 kHz mode, just use v0.9.1 of the workflow.
- If the newer Medaka versions fixed this issue and you can get someone to help you with containers and environments, you can install a local Conda environment in the container with a newer Medaka version and use that with the workflow.
- You can try a different MAG recovery workflow for Nanopore data, like Aviary (https://github.com/rhysnewell/aviary)
Best of luck!
Hi Mantas,
Thank you very much for dealing with all this. I'll try to stick with mmlong2 since I really like how simple and quick the pipeline is to run, but I will certainly take a closer look at Aviary. Last time I looked at it I thought it was Illumina-only and hybrid assembly, but it looks like there have been some updates. Coming from a Nextflow background, Snakemake is still a little hard for me to comprehend, but all the help has been very much appreciated!