This repository demonstrates an issue that occurs when importing Python modules within Snakemake modules. It was created using Snakemake v8.14.0 and additionally tested with Snakemake v7.28.3.
The text below refers to the main branch. On the renamed_script_directory branch, the workflow runs because the scripts directory inside minimal_module/workflow has been renamed.
This repository has the following directory structure:
.
├── README.md
└── workflow
    ├── minimal_module
    │   └── workflow
    │       ├── scripts
    │       │   ├── __init__.py
    │       │   └── input_functions_module.py
    │       └── Snakefile
    ├── scripts
    │   ├── __init__.py
    │   └── input_functions_main.py
    └── Snakefile

5 directories, 7 files
There is a main workflow and a module workflow. The latter is stored within minimal_module.
The main workflow creates an output file results/output.txt, which requires the input results/intermediate.txt.
The intermediate file is the output of minimal_module.
The main workflow uses an input function defined inside input_functions_main.py.
The module uses an input function defined inside input_functions_module.py.
When executing the main workflow with snakemake -c1, the following error is returned:
ModuleNotFoundError in file /home/esox/github/minimal_example_snakemake_module/workflow/minimal_module/workflow/Snakefile, line 1:
No module named 'scripts.input_functions_module'
File "/home/esox/github/minimal_example_snakemake_module/workflow/Snakefile", line 6, in <module>
File "/home/esox/github/minimal_example_snakemake_module/workflow/minimal_module/workflow/Snakefile", line 1, in <module>
In other words, the input function of the module cannot be imported.
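
One plausible explanation, not confirmed here, is a package name collision: both workflows ship a package called scripts, and Python caches the first one it imports in sys.modules. The sketch below reproduces that behaviour in plain Python, without Snakemake. It assumes that the main workflow directory and the module directory are both placed on sys.path, and that the main package is imported first. The helper names make_package and try_imports, the get_input function body, and the package name module_scripts are hypothetical and do not come from this repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root: Path, package: str, module: str) -> None:
    """Create <root>/<package>/__init__.py and <root>/<package>/<module>.py."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")
    # Hypothetical input function; the real functions are not shown in this README.
    (pkg_dir / f"{module}.py").write_text("def get_input(wildcards):\n    return []\n")


def try_imports(main_package: str, module_package: str) -> str:
    """Import from the main package, then from the module package.

    Returns "ok" if both imports succeed, otherwise the exception class name.
    """
    names = {main_package, module_package}

    def is_ours(name: str) -> bool:
        return name.split(".")[0] in names

    # Remember interpreter state so the demonstration leaves no traces behind.
    saved_path = list(sys.path)
    saved_modules = {k: v for k, v in sys.modules.items() if is_ours(k)}
    for key in saved_modules:
        del sys.modules[key]
    try:
        with tempfile.TemporaryDirectory() as tmp:
            main_root = Path(tmp) / "main"
            module_root = Path(tmp) / "module"
            make_package(main_root, main_package, "input_functions_main")
            make_package(module_root, module_package, "input_functions_module")
            importlib.invalidate_caches()

            # Assumption: the main workflow directory is on sys.path first.
            sys.path.insert(0, str(main_root))
            importlib.import_module(f"{main_package}.input_functions_main")

            # Assumption: the module's directory is added afterwards.
            sys.path.insert(0, str(module_root))
            try:
                importlib.import_module(f"{module_package}.input_functions_module")
            except ModuleNotFoundError as err:
                return type(err).__name__
            return "ok"
    finally:
        # Restore sys.path and drop the temporary packages from the module cache.
        sys.path[:] = saved_path
        for key in [k for k in sys.modules if is_ours(k)]:
            del sys.modules[key]
        sys.modules.update(saved_modules)


if __name__ == "__main__":
    print(try_imports("scripts", "scripts"))         # same package name twice
    print(try_imports("scripts", "module_scripts"))  # distinct package names
```

With identical package names, the second import is resolved against the already cached scripts package, which does not contain input_functions_module, so a ModuleNotFoundError is raised. With distinct names, both imports succeed, which is consistent with the workflow running once the module's scripts directory is renamed.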
When executing the module on its own from within workflow/minimal_module using snakemake -c1 results/intermediate.txt, the input function is found (shown here as a dry run with -n):
snakemake -c1 results/intermediate.txt -n
Building DAG of jobs...
Job stats:
job                    count
-------------------  -------
create_input               1
create_intermediate        1
total                      2
Execute 1 jobs...
[Wed Jun 12 10:36:57 2024]
rule create_input:
output: results/input.txt
jobid: 1
reason: Missing output files: results/input.txt
resources: tmpdir=<TBD>
Execute 1 jobs...
[Wed Jun 12 10:36:57 2024]
rule create_intermediate:
input: results/input.txt
output: results/intermediate.txt
jobid: 0
reason: Missing output files: results/intermediate.txt; Input files updated by another job: results/input.txt
resources: tmpdir=<TBD>
Job stats:
job                    count
-------------------  -------
create_input               1
create_intermediate        1
total                      2
Reasons:
(check individual jobs above for details)
input files updated by another job:
create_intermediate
output files have to be generated:
create_input, create_intermediate
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.