sanger-pathogens/Roary

I couldn't find specific predicted coding regions in gene_presence_absence.csv.

huminfo8 opened this issue · 0 comments

Hello from Japan.
I am analyzing 38 GFF files from a specific bacterial species. Here is the command I use to run Roary:

${SINGULARITY} exec --cleanenv ${ROARY_SIF} roary -p ${THREADS} -e --mafft -i 95 -f ${out_dir} ${gff_dir}/*.gff

This set of GFF files contains 86330 predicted coding regions in total, but I could find only 84740 of them in the gene_presence_absence.csv file.

I then checked the ffn sequences of the 1590 missing regions (86330 - 84740): they were longer than 120 bp and contained no Ns. What could explain why these 1590 coding regions are missing from the output?
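
For reference, a comparison like the one described above could be sketched roughly as follows. This is only a minimal sketch, not part of Roary: the function names are hypothetical, and it assumes that each CDS in the GFF files carries an `ID=` attribute, that the first 14 columns of gene_presence_absence.csv are metadata (passed as `n_meta_cols`, adjust if your file differs), and that a cell holding several IDs for one isolate separates them with tabs.

```python
import csv
import io


def cds_ids_from_gff(gff_text):
    """Collect the ID attribute of every CDS feature in GFF3 text."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # sequences embedded after this marker are not features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        for attr in cols[8].split(";"):
            if attr.startswith("ID="):
                ids.add(attr[3:])
    return ids


def ids_from_presence_absence(csv_text, n_meta_cols=14):
    """Collect every ID listed in the per-isolate columns of the CSV.

    n_meta_cols is an assumption about how many leading columns hold
    metadata rather than isolate IDs; check it against your header.
    """
    ids = set()
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        for cell in row[n_meta_cols:]:
            # assumed: multiple IDs in one cell are tab-separated
            for tag in cell.split("\t"):
                tag = tag.strip()
                if tag:
                    ids.add(tag)
    return ids


def missing_ids(gff_texts, csv_text, n_meta_cols=14):
    """Return CDS IDs present in the GFF inputs but absent from the CSV."""
    all_cds = set()
    for text in gff_texts:
        all_cds |= cds_ids_from_gff(text)
    return all_cds - ids_from_presence_absence(csv_text, n_meta_cols)
```

Reading each GFF file and the CSV into strings and passing them to `missing_ids` would give the set of IDs to inspect further, for example to check whether they share a contig, a length range, or a particular annotation.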