caravagnalab/nextflow_modules

Error while mounting singularity images for Nextflow pipelines: "failed to create file /image/root/etc/issue, because No space left on device"

Closed this issue · 3 comments

Define how to make Nextflow point to a specific folder when mounting singularity images in order to avoid filling the /tmp node folders

I am trying to run Sarek to perform tumor-only variant calling on some whole-exome sequencing samples. Every time I launch the pipeline (I have tried both with and without the -resume flag), it stops at a different step with the following error output:

Command exit status:
 255
Command output:
 (empty)

Command error:
 INFO:    Converting SIF file to temporary sandbox...
 FATAL:   while handling /orfeo/cephfs/scratch/area/vgazziero/nextflow/singularity/sarek/depot.galaxyproject.org-singularity-fastp-0.23.4--h5f740d0_0.img: while extracting image: root filesystem extraction failed: extract command failed: WARNING: passwd file doesn't exist in container, not updating
 WARNING: group file doesn't exist in container, not updating
 WARNING: Skipping mount /etc/hosts [binds]: /etc/hosts doesn't exist in container
 WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
 WARNING: Skipping mount proc [kernel]: /proc doesn't exist in container
 WARNING: Skipping mount /orfeo/cephfs/opt/programs/intel/fedora37/singularity/3.10.4/var/singularity/mnt/session/tmp [tmp]: /tmp doesn't exist in container
 WARNING: Skipping mount /orfeo/cephfs/opt/programs/intel/fedora37/singularity/3.10.4/var/singularity/mnt/session/var/tmp [tmp]: /var/tmp doesn't exist in container
 WARNING: Skipping mount /orfeo/cephfs/opt/programs/intel/fedora37/singularity/3.10.4/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

 FATAL ERROR: write_file: failed to create file /image/root/etc/issue, because No space left on device
 Parallel unsquashfs: Using 36 processors
 693 inodes (526 blocks) to write

 : exit status 1

Work dir:
 /orfeo/cephfs/scratch/area/vgazziero/SarekPipelinetmp/wd_all_48_noP14/88/a8c53eb7b5b62c63a5d8d3629c913e

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

-- Check '.nextflow.log' file for details

The error comes from Singularity: the system is able to download the images from galaxyproject.org, as the pipeline does by default, but fails when mounting them on the cluster. In particular, for every step the pipeline allocates a compute node with the resources specified in the configuration files and tries to mount the Singularity images (downloaded in the -work-dir) in the /tmp directory of the node itself, which then fails with the error above.

UPDATE: This is a problem with the cluster itself. Currently working on it.

Possible solutions (which might also be combined) that we are currently working on:

  • work out how to make Singularity, through the Nextflow configuration file, point to a specific directory when mounting the containers instead of the root /tmp folder
  • run Nextflow with the cleanup flag so that the folder does not run out of inodes
  • take note of the cluster nodes that show this behaviour
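To help with the last point, a small script can report how much space and how many inodes are left in the directory Singularity uses for its temporary sandbox on a given node. This is only a sketch: the function name `check_tmp_space` is made up for illustration, and it assumes that Singularity uses `SINGULARITY_TMPDIR` when it is set and `/tmp` otherwise.

```shell
#!/usr/bin/env bash
# Sketch: print free space and free inodes for the Singularity temp directory.
# check_tmp_space is a hypothetical helper name, not part of Nextflow or Singularity.
# Assumption: Singularity extracts its sandbox under $SINGULARITY_TMPDIR, or /tmp if unset.
check_tmp_space() {
    local dir="${1:-${SINGULARITY_TMPDIR:-/tmp}}"
    echo "node: $(uname -n)"
    echo "dir:  $dir"
    # Second line of POSIX-format df output: field 4 is available, field 5 is percent used.
    df -Ph "$dir" | awk 'NR==2 {print "space avail: " $4 " (" $5 " used)"}'
    # Same layout for inodes (-i): field 4 is free inodes, field 5 is percent used.
    df -Pi "$dir" | awk 'NR==2 {print "inodes free: " $4 " (" $5 " used)"}'
}

check_tmp_space "$@"
```

Since the config file name suggests Slurm is in use, the script could be run on a suspect node with something like `srun --nodelist=<node> bash check_tmp_space.sh` (the script file name is a placeholder), and the output saved next to the list of failing nodes.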

One way to set the tmp folder used by Singularity is to add the following to the nextflow_config_slurm file:

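singularity {
    enabled = true
    // inside the singularity scope, options are written without the "singularity." prefix
    cacheDir = "path/to/cache"
    // per the Nextflow docs, envWhitelist lists variable names only;
    // export SINGULARITY_TMPDIR=path/to/tmp in the shell before launching
    envWhitelist = "SINGULARITY_TMPDIR"
    // single quotes: $SINGULARITY_TMPDIR is expanded by the shell at run time, not by Nextflow
    runOptions = '--bind /orfeo:/orfeo --bind $SINGULARITY_TMPDIR:/tmp'
}

where path/to/cache and path/to/tmp are user-defined paths to the cache and tmp locations.
path/to/tmp should ideally point to /tmp.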

At the moment, the tmp folder cannot be created in /. For some reason it is not exported to the worker nodes.
For now, just create a folder /orfeo/LTS/CDSLab/LT_storage/<username>/tmp and add the line export SINGULARITY_TMPDIR="/orfeo/LTS/CDSLab/LT_storage/<username>/tmp" to your ~/.bashrc.
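The workaround can be scripted so that it is safe to run more than once. The following is a minimal sketch, not a tested recipe for this cluster: `setup_singularity_tmpdir` is a hypothetical helper name, and the directory and rc file are passed in as arguments rather than hard-coded.

```shell
#!/usr/bin/env bash
# Sketch: create a Singularity temp directory and persist the export in an rc file.
# setup_singularity_tmpdir is a hypothetical helper, not part of any tool.
#   $1: directory to use as SINGULARITY_TMPDIR (required)
#   $2: rc file to update (defaults to ~/.bashrc)
setup_singularity_tmpdir() {
    local tmpdir="$1"
    local rcfile="${2:-$HOME/.bashrc}"
    local line="export SINGULARITY_TMPDIR=\"$tmpdir\""

    if [ -z "$tmpdir" ]; then
        echo "usage: setup_singularity_tmpdir <dir> [rcfile]" >&2
        return 1
    fi

    mkdir -p "$tmpdir" || return 1
    touch "$rcfile" || return 1
    # Append the export only if that exact line is not already present,
    # so repeated runs do not duplicate it.
    grep -qxF "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
    # Make the variable available in the current shell as well.
    export SINGULARITY_TMPDIR="$tmpdir"
}
```

After sourcing the function, it would be called as `setup_singularity_tmpdir "/orfeo/LTS/CDSLab/LT_storage/<username>/tmp"`, with `<username>` replaced by your own user name. New login shells then pick up the variable from `~/.bashrc`.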