How to run job arrays based on a function of TASK_ID?
I initially asked this question in a private email to the author, whose response I would like to share here in case anyone else is interested.
Here is my question:
I have a comma-separated file named my.txt that lists the trait index pairs for my analyses.
My first bivariate GREML analysis should be specified as --reml-bivar 1 2, the second as --reml-bivar 3 1, ..., and the 6th as --reml-bivar 4 3. I used the bash script below to submit the jobs, without success. I can tell the issue is related to the handling of TASK_ID in the following two lines:
trait1_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $1}' /storage/***/my.txt)
trait2_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $2}' /storage/***/my.txt)
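To make the extraction concrete, here is a minimal sketch of what those two awk calls do. The contents of my.txt were not reproduced in this thread, so the rows below are purely illustrative:

# Hypothetical my.txt (values are illustrative only):
#   2,1
#   1,3
# NR==task selects the row whose line number equals the array task id,
# and $1/$2 are the first and second comma-separated fields of that row.
TASK_ID=2
awk -F ',' -v task="$TASK_ID" 'NR==task {print $1}' my.txt   # first field of row 2
awk -F ',' -v task="$TASK_ID" 'NR==task {print $2}' my.txt   # second field of row 2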
I did something similar in a Slurm script, which worked, but the same approach did not work with qsubshcom.
Here is the unsuccessful qsubshcom script:
#!/bin/bash
script_dir=$(dirname $(readlink -f $0))
logs_dir=${script_dir}/../../logs
results_dir=${script_dir}/../../results
grm_dir=/storage/***/grm
cd ${logs_dir}
# These two lines run on the login node at submission time, where TASK_ID
# is unset, so both variables come back empty (see the author's response below).
trait1_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $1}' /storage/***/my.txt)
trait2_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $2}' /storage/***/my.txt)
command1="gcta \
--reml-bivar ${trait2_L1} ${trait1_L1} \
--reml-bivar-lrt-rg 0 \
--grm ${grm_dir}/GRM_mafgt0.5 \
--pheno ${results_dir}/my.phen \
--out ${results_dir}/$(echo "rg_T${trait2_L1}T${trait1_L1}")"
qsubshcom "$command1" 1 10G myjob 23:00:00 "-queue=***, -array=1-10"
Here is the successful Slurm script:
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=/storage/***/%x_%a.out
#SBATCH --error=/storage/***/%x_%a.err
#SBATCH --chdir=/storage/***
#SBATCH --array=1-10
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
trait1_L1=$(awk -F ',' -v task=$((SLURM_ARRAY_TASK_ID)) 'NR==task {print $1}' my.txt)
echo $trait1_L1
trait2_L1=$(awk -F ',' -v task=$((SLURM_ARRAY_TASK_ID)) 'NR==task {print $2}' my.txt)
echo $trait2_L1
outName="$(echo "rg_lev1_c${trait2_L1}VSc${trait1_L1}")"
gcta \
--reml-bivar ${trait2_L1} ${trait1_L1} \
--reml-bivar-lrt-rg 0 \
--grm ${grm_dir}/GRM_mafgt0.5 \
--pheno ${pheno_dir}/my.phen \
--out ${out_dir}/${outName}
How could I modify the qsubshcom script to get the jobs running?
Author response:
TASK_ID is a variable that exists only on the remote worker node, so it cannot be resolved to its correct value on your local submission machine.
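In other words, the failing script runs its awk lines, and expands the double-quoted $command1 string, on the login node at submission time, when TASK_ID is still unset, so empty values are baked into the submitted command. (The Slurm version works because the batch script itself executes on the worker node, where SLURM_ARRAY_TASK_ID is already set.) A minimal, standalone bash sketch of the quoting behavior, nothing qsubshcom-specific:

#!/bin/bash
# TASK_ID is not set here, just as on the login node.
cmd_now="echo task=${TASK_ID}"    # double quotes: expanded immediately
cmd_later='echo task=${TASK_ID}'  # single quotes: kept literal
echo "$cmd_now"     # prints: echo task=
echo "$cmd_later"   # prints: echo task=${TASK_ID}

Wrapping the work in a separate script file, as below, sidesteps the problem entirely: the file is executed on the worker node, where TASK_ID is defined.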
Here is my revised script, following the author's suggestion. Create a file named test.sh containing the following:
#!/bin/bash
script_dir=$(dirname $(readlink -f $0))
logs_dir=${script_dir}/../../logs/rgStrModEur
results_dir=${script_dir}/../../results/rgStrModEur
grm_dir=/storage/***/grm
cd ${logs_dir}
# TASK_ID is defined here because test.sh runs on the worker node.
trait1_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $1}' /storage/***/my.txt)
trait2_L1=$(awk -F ',' -v task=$((TASK_ID)) 'NR==task {print $2}' /storage/***/my.txt)
gcta \
--reml-bivar ${trait2_L1} ${trait1_L1} \
--reml-bivar-lrt-rg 0 \
--grm ${grm_dir}/GRM_mafgt0.5 \
--pheno ${results_dir}/my.phen \
--out ${results_dir}/rg_T${trait2_L1}T${trait1_L1}
Then run the following:
qsubshcom "bash test.sh" 1 10G test 23:00:00 "-queue=*** -array=1-10"
My revised script worked fine. However, the job_reports folder and the qsub log file are created in the directory I submitted the jobs from, rather than in the desired logs_dir. How can I make the logs_dir defined above hold the job_reports folder and the qsub log file when I run
qsubshcom "bash test.sh" 1 10G test 23:00:00 "-queue=*** -array=1-10"
manually in the shell?
Hi @kcstringer,
Thanks for posting here. I insist on using issues instead of email: 1. they are publicly available; 2. an email address expires when I change jobs (my UQ email will expire next week).
Glad to hear you have almost solved the issue. The cluster log is usually not important, so I write it to a fixed job_reports folder, along with a qsub.$(date).log that saves all of your commands. I do it this way because it makes it easier for users to find what has been run, and the logs are all in job_reports. It often happens to me that I try multiple commands and forget which one I ran last; I just look into the latest qsub.$(date).log, and the whole history is there.
I will not add another flag to customize this, as the script is used across several groups and the change might annoy users. I hope this won't bother you much.
Regards,
Zhili
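Since job_reports and the qsub log land in whatever directory you submit from, a simple workaround consistent with that behavior (a sketch, not an official qsubshcom option; the path to test.sh is a placeholder) is to change into logs_dir before submitting:

# Submit from logs_dir so job_reports/ and qsub.*.log are created there;
# use an absolute path to test.sh because of the directory change.
cd /storage/***/logs && qsubshcom "bash /path/to/test.sh" 1 10G test 23:00:00 "-queue=*** -array=1-10"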
Thank you. This question is resolved and can be closed.