Nextomics/NextPolish2

nextpolish2 runs too slow

Han-Cao opened this issue · 8 comments

Hi,

I am polishing human genome assembly with ~30X HiFi reads and ~30X paired end illumina reads, using 30 CPU cores with 192GB RAM. The job has run ~15 hours. However, according to your paper, it only takes ~90 to 100 min with 5 CPU cores and 256GB RAM. Is it normal to run the job for such a long time?

Below is the code I used to prepare the input and run nextpolish2. The 15 hour is for nextpolish2 alone.

# minimap2 align to assembly (2.26-r1175)
minimap2 -ax map-hifi -Q -t $threads \
${asmpath}/${sample}.asm.bp.hap1.p_ctg.inspector.fa \
${lrspath}/${sample}.filt.fq.gz | samtools sort -o ${output_prefix}.hap1.bam - 
samtools index -@ $threads ${output_prefix}.hap1.bam


# prepare k-mer count (0.1-r69-dirty)
yak count -o ${output_prefix}.k21.yak -k 21 -b 37 -t $threads \
<(zcat ${srspath}/${sample}_1_paired.fq.gz) <(zcat ${srspath}/${sample}_2_paired.fq.gz) 
yak count -o ${output_prefix}.k31.yak -k 31 -b 37 -t $threads \
<(zcat ${srspath}/${sample}_1_paired.fq.gz) <(zcat ${srspath}/${sample}_2_paired.fq.gz) 

# run nextPolish2 (nextPolish2 0.2.0)
nextPolish2 -t $threads \
${output_prefix}.hap1.bam \
${asmpath}/${sample}.asm.bp.hap1.p_ctg.inspector.fa \
${output_prefix}.k21.yak \
${output_prefix}.k31.yak > ${output_prefix}.asm.bp.hap1.p_ctg.inspector.np2.fa

Not normal. Can you use top to check if the cpu time is increasing and use free to check if 192GB is enough?

I just re-run the job with the same code, the output from top and free are:

image

image

The screenshot was taken after only 2 minutes of running. It has not yet reached the bottleneck stage, so it is impossible to determine what the problem is. However, you can try to reduce the $threads to below 5 and rerun.

Actually, that was taken after running more than 1 hour. The CPU usage was very low, only consuming 1 single core. I will try reducing to a lower number of threads.

I have tried to used only 5 cores, and it is still very slow. The log from some jobs are:

Real time: 80856 sec; CPU: 204147 sec; Peak RSS: 55.330 GB
Real time: 85194 sec; CPU: 220292 sec; Peak RSS: 56.423 GB
Real time: 67118 sec; CPU: 172853 sec; Peak RSS: 48.021 GB

Here is my 4 logs when test to polish the CHM13 genome.

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1
time /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.winnowmap.all.fasta
+ /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 2555 sec; CPU: 11543 sec; Peak RSS: 71.768 GB


real	42m35.498s
user	121m48.883s
sys	70m35.746s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work1/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2
time /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.winnowmap.fasta
+ /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 3822 sec; CPU: 12472 sec; Peak RSS: 73.847 GB


real	63m43.374s
user	128m31.153s
sys	79m21.651s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work2/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3
time /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.minimap2.all.fasta
+ /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -r -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 3694 sec; CPU: 12315 sec; Peak RSS: 69.862 GB


real	61m35.498s
user	125m48.392s
sys	79m28.054s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work3/subtask.sh.done

hostname
+ hostname
cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4
+ cd /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4
time /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak > /path/02_np2/05_eva/chm13/asm.fa.polish/asm.k21.31.minimap2.fasta
+ /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
CMD: /path/00_software/nextPolish2 -t 5 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak
Real time: 2465 sec; CPU: 11204 sec; Peak RSS: 74.113 GB


real	41m6.182s
user	119m19.423s
sys	67m25.821s
touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4/subtask.sh.done
+ touch /path/02_np2/05_eva/chm13/asm.fa.polish/run1.sh.work/work4/subtask.sh.done

and here are the input files size:

2.7 GB 23-Oct-30 16:24:09 /path/02_np2/05_eva/chm13/asm.fa.polish/asm.fa
23 GB 23-Nov-05 20:08:59 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.minimap2.sort.bam
23 GB 23-Nov-06 05:29:38 /path/02_np2/05_eva/chm13/asm.fa.polish/hifi.winnowmap.sort.bam
38 GB 23-Nov-01 23:05:00 /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k21.yak
35 GB 23-Nov-05 07:19:30 /path/02_np2/05_eva/chm13/asm.fa.polish/sr.k31.yak

So you can compare this information with your input and figure out what the problem is? My guess is that your system has a serious IO bottleneck or your input genomes contain many short contigs (<1Mb).

I see, if the benchmark is on CHM13, then it should be due to short contigs. The statistics of my assembly is like:

Statics of contigs:
Number of contigs	865
Number of contigs > 10000 bp	865
Number of contigs >1000000 bp	149
Total length	3024706003
Total length of contigs > 10000 bp	3024706003
Total length of contigs >1000000bp	2914144160
Longest contig	137403523
Second longest contig length	131182353
N50	50325863
N50 of contigs >1Mbp	50325863

Then my question is if the tool is suitable for non-T2T assemblies? Are there any results for polishing hifi-only assemblies? It is really hard and expensive to generate genome assemblies with high quality like HG002 or CHM13

Besides, why your CHM13 is only 2.7G? This is much less than human genome and the CHM13 fasta I have.

  1. It is suitable for any assemblies.
  2. The CHM13 file size is 2.7Gb however the genome size is 2951058554.