Piano and WebGestalt
m-jahani opened this issue · 4 comments
The cluster on which I am running RNAflow only has internet access on the main node. Consequently, when I ran the job through Slurm, both the piano and WebGestalt steps were skipped. After the Slurm run had completed, I tried to resume the pipeline locally on the head node, which has internet access, instead of through Slurm. However, the pipeline started reanalyzing all the steps that had already been completed, whereas I only wanted it to run the skipped steps (piano and WebGestalt). What is the best way to ensure that only the piano and WebGestalt steps are executed in the local run?
Thanks
Hi @m-jahani ,
This should be doable with an injected configuration. First, create a simple text file, let's say my.config. In there, we set the executor to local only for those two processes:
process {
    withName: webgestalt {
        executor = "local"
    }
    withName: piano {
        executor = "local"
    }
}
This configuration can then be injected by adding it with -c (see docs):
nextflow run [...] -c my.config
Let me know if it works!
I tried it. The intention was to run everything on Slurm except for the piano and WebGestalt steps. However, it did not work as expected, and it failed at the two specified steps.
nextflow run hoelzer-lab/rnaflow -c /home/mjahani/scratch/NEW_TMP/clean_slurm_test3/my.config -profile slurm,conda,latency --skip_sortmerna \
--reads /home/mjahani/scratch/files/test_tmp_clean.csv \
--genome /home/mjahani/scratch/files/fastas.csv \
--annotation /home/mjahani/scratch/files/gtfs.csv \
--permanentCacheDir /home/mjahani/scratch/conda_dataset/nextflow-autodownload-databases \
--condaCacheDir /home/mjahani/scratch/conda_dataset/conda \
--pathway hsa
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Output path: results
Strandedness unstranded
Read mode: paired-end
TPM threshold: 1
Comparisons: all
Nanopore mode: false
executor > slurm (166)
[e4/d10372] process > concat_genome [100%] 1 of 1 ✔
[ca/8f7b4a] process > concat_annotation [100%] 1 of 1 ✔
[60/541159] process > preprocess_illumina:fastqcPre (43896327) [100%] 26 of 26 ✔
[d3/766a42] process > preprocess_illumina:fastp (44476765) [100%] 26 of 26 ✔
[f6/8fd925] process > preprocess_illumina:fastqcPost (44172717) [100%] 26 of 26 ✔
[33/16a324] process > preprocess_illumina:hisat2index [100%] 1 of 1 ✔
[e7/0fc4c8] process > preprocess_illumina:hisat2 (44009528) [100%] 26 of 26 ✔
[97/7c4ddb] process > preprocess_illumina:index_bam (44009528) [100%] 26 of 26 ✔
[df/b69117] process > expression_reference_based:featurecounts (44009528) [100%] 26 of 26 ✔
[33/f26765] process > expression_reference_based:format_annotation_gene_rows [100%] 1 of 1 ✔
[26/ef2bb2] process > expression_reference_based:format_annotation [100%] 1 of 1 ✔
[9b/36f412] process > expression_reference_based:tpm_filter [100%] 1 of 1 ✔
[4d/450fc1] process > expression_reference_based:deseq2 (1) [100%] 2 of 2, failed: 1, retries: 1 ✔
[7d/6c39b9] process > expression_reference_based:piano (0_vs_1) [100%] 1 of 1, failed: 1
[- ] process > expression_reference_based:webgestalt [ 0%] 0 of 1
[d0/c9ccdf] process > expression_reference_based:multiqc_sample_names (1) [100%] 1 of 1 ✔
[56/f9bb66] process > expression_reference_based:multiqc (1) [100%] 1 of 1 ✔
ERROR ~ Error executing process > 'expression_reference_based:piano (0_vs_1)'
Caused by:
Process requirement exceeds available CPUs -- req: 24; avail: 6
Command executed:
R CMD BATCH --no-save --no-restore '--args c(".") c("deseq2_0_vs_1_filtered_padj_0.05.csv") c("hsa") c("ensembl_gene_id") c("24")' piano.R
Command exit status:
-
Command output:
(empty)
Work dir:
/lustre07/scratch/mjahani/NEW_TMP/clean_slurm_test3/work/7d/6c39b921ba00746d013403294f4069
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
Okay, Piano seems to request too many CPUs:
Caused by:
Process requirement exceeds available CPUs -- req: 24; avail: 6
You can change that in the config snippet accordingly:
process {
    withName: webgestalt {
        executor = "local"
    }
    withName: piano {
        executor = "local"
        cpus = 4
        memory = { 4.GB * task.attempt }
    }
}
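If webgestalt turns out to hit the same limit once piano gets through, a more compact variant might be to cover both processes with one selector. This is only a sketch: it assumes that webgestalt also runs fine with the reduced resources (not verified here), and the values 4 CPUs and 4 GB are simply carried over from the snippet above. The only hard constraint known from the error message is that the request has to stay at or below the 6 CPUs the head node reports. Nextflow selectors accept a regular expression, so the two names can be joined with `|`:

```groovy
process {
    // One selector for both internet-dependent steps; 'piano|webgestalt'
    // is a regular expression matching either process name.
    withName: 'piano|webgestalt' {
        // Run on the node where nextflow itself was started (the head node).
        executor = 'local'
        // Assumption: 4 CPUs is enough for both steps; must not exceed
        // the 6 CPUs available to the local executor.
        cpus = 4
        // Assumption: 4 GB per attempt, growing on each retry.
        memory = { 4.GB * task.attempt }
    }
}
```

It is injected the same way as before, with -c my.config on the nextflow run command line.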
Thanks a bunch, that was super helpful!