huangnengCSU/compleasm

arthropoda_odb10 same BUSCOs for multiple assemblies

katiecdillon opened this issue · 4 comments

Hello,

I am running this script:
SCRIPT_NCBI_quast_compleasm_20240413_v1.txt

And getting this output:
quast_compleasm_27733587.txt

In short, for every different assembly I am getting the exact same BUSCO scores.

Please let me know how I can best resolve this.

Thank you,

Katie Dillon

Hi Katie,
Are the two for-loops in the submit script correct? I assume you have 36 assemblies to evaluate, but your submit script appears to launch 36×36 jobs in total. Apart from the two for-loops, I didn't find any other problems. Let me know if the problem persists after you update the script.
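To illustrate the loop issue, here is a minimal sketch, not the actual submit script (which is only attached, not shown in this thread). The assembly file names and the `_compleasm` output suffix are hypothetical; the commands are only printed, not executed.

```shell
# Hypothetical assembly list for illustration; substitute your own files.
assemblies="asm_A.fna asm_B.fna asm_C.fna"

# Nested loops: the inner loop repeats for every outer iteration,
# so N assemblies produce N x N runs.
nested_jobs=0
for a in $assemblies; do
  for b in $assemblies; do
    nested_jobs=$((nested_jobs + 1))
  done
done

# Single loop: one run per assembly, each with its own output directory.
single_jobs=0
for a in $assemblies; do
  out="${a%.fna}_compleasm"   # hypothetical naming scheme
  # Dry run: print the command instead of executing it.
  echo "compleasm run -a $a -o $out -l arthropoda_odb10 -L ../mb_downloads/ -t 10"
  single_jobs=$((single_jobs + 1))
done

echo "nested: $nested_jobs, single: $single_jobs"
```

With 36 assemblies, the nested form would launch 1296 runs, while the single loop launches 36.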

Hello,

I had also taken three of the assemblies out of the loops and run them individually, and they all had the same BUSCO scores.

SCRIPT_NCBI_quast_compleasm_20240414_v1.txt

I also ran the same script using module load instead of Singularity and got the same output.

I tested these three assemblies, and here are the commands and results.

compleasm run -a /hlilab/neng/data/genomes/Amblyomma_americanum/ncbi_dataset/data/GCA_030143305.1/GCA_030143305.1_ASM3014330v1_genomic.fna -o A_americanum_o -l arthropoda_odb10 -L ../mb_downloads/ -t 10
S:77.00%, 780
D:12.54%, 127
F:6.02%, 61
I:0.00%, 0
M:4.44%, 45
N:1013

compleasm run -a /hlilab/neng/data/genomes/Dermacentor_andersoni/ncbi_dataset/data/GCA_023375885.2/GCA_023375885.2_qqDerAnde1.2_genomic.fna -o D_andersoni01_o -l arthropoda_odb10 -L ../mb_downloads/ -t 10
S:95.85%, 971
D:1.97%, 20
F:1.18%, 12
I:0.00%, 0
M:0.99%, 10
N:1013

compleasm run -a /hlilab/neng/data/genomes/Amblyomma_maculatum/ncbi_dataset/data/GCA_023969395.1/GCA_023969395.1_ASM2396939v1_genomic.fna -o A_maculatum_o -l arthropoda_odb10 -L ../mb_downloads/ -t 10
S:92.69%, 939
D:1.88%, 19
F:3.36%, 34
I:0.10%, 1
M:1.97%, 20
N:1013

I also tested with Singularity, as follows:

singularity pull docker://huangnengcsu/compleasm:v0.2.6
singularity exec -B /hlilab/neng/data/genomes/:/assembly,/hlilab/neng/projs/proj-busco/compleasm_dir/mb_downloads:/mb_download,/hlilab/neng/projs/proj-busco/compleasm_dir/Issue_35:/output compleasm_v0.2.6.sif compleasm run -a /assembly/Dermacentor_andersoni/ncbi_dataset/data/GCA_023375885.2/GCA_023375885.2_qqDerAnde1.2_genomic.fna -o /output/D_andersoni01_o -l arthropoda_odb10 -L /mb_download -t 8
S:95.85%, 971
D:1.97%, 20
F:1.18%, 12
I:0.00%, 0
M:0.99%, 10
N:1013

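I suspect your problem is caused by how Singularity is being used, or by an output directory that was not cleaned between runs. First, delete all the existing output folders. Second, when using Singularity, mount (bind) every directory the run needs, including the output directory, as in the command above.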
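The two steps can be sketched roughly as below. This is a dry run under assumptions: the host paths are hypothetical placeholders, the stale folder is simulated in a temporary directory, and the Singularity command is only printed. The flags are the same ones used in the command earlier in this reply.

```shell
# Hypothetical host paths for illustration; substitute your own.
host_out="${TMPDIR:-/tmp}/compleasm_demo_out"
host_genomes="/path/to/genomes"
host_lineages="/path/to/mb_downloads"
run_name="D_andersoni01_o"

# Simulate an output folder left behind by an earlier run.
mkdir -p "$host_out/$run_name"
touch "$host_out/$run_name/stale_summary.txt"

# Step 1: delete the old output folder so no earlier results remain in it.
rm -rf "${host_out:?}/$run_name"

# Step 2: bind every directory the run needs, and point -o at the bound path.
binds="$host_genomes:/assembly,$host_lineages:/mb_download,$host_out:/output"
cmd="singularity exec -B $binds compleasm_v0.2.6.sif compleasm run -a /assembly/genome.fna -o /output/$run_name -l arthropoda_odb10 -L /mb_download -t 8"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Because `/output` is bound to a host directory, results written to `/output/D_andersoni01_o` inside the container end up in that host directory under the same folder name.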

Ah, I see my problem. I thought the -o flag meant I needed to include an output directory, but I guess it just needs the name of a directory, not necessarily its path. Thank you again for your help!