roary_2_pagoo Error: subscript contains invalid names
Closed this issue · 35 comments
Hi
I encountered the following error while trying to create an R6 class object. This is the script I'm using and the corresponding error:
library(pagoo)
gffs <- list.files(pattern = "[.]gff$", recursive = TRUE, full.names = TRUE)
gpa_csv <- "/home/jason/Documents/pagoo/gene_presence_absence.csv"
p <- roary_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs, sep = "__", paralog_sep = "\t")
Reading csv file (roary).
Processing csv file.
Reading gff file ./10432_62_LANL.gff
Reading gff file ./107V1216_BRAC.gff
Reading gff file ./1154_74_LANL.gff
Reading gff file ./11S_UM.gff
Reading gff file ./1346_SC.gff
Reading gff file ./1362_SC.gff
.
.
.
Reading gff file ./YB8E08_UA.gff
Reading gff file ./YN2011004_YPCDCP.gff
Reading gff file ./YN89004_YPCDCP.gff
Reading gff file ./YN97083_YPCDCP.gff
Error: subscript contains invalid names
Hi @jhcuarta , I'm out of office without my laptop the next two weeks. I'll put a reminder to address this as soon as I can. Sorry.
Bests!
It's crashing on these lines in the roary_2_pagoo script:
## Selected columns
cols <- c('seqid', 'type', 'start', 'end', 'strand', 'product', 'org', 'locus_tag')
mcls <- lapply(mcls, function(x) x[ , cols])
You are not reading in any 'product' field in your read_gff function.
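To illustrate the failure mode described above: selecting a fixed set of columns fails as soon as one of them was never parsed. Below is a minimal sketch using a plain data.frame. pagoo's internal tables may be a different class, and `select_cols` is a hypothetical helper, not part of pagoo. It reports the missing columns and fills them with NA instead of erroring.

```r
# Columns the converter expects, copied from the snippet above.
cols <- c("seqid", "type", "start", "end", "strand",
          "product", "org", "locus_tag")

# Hypothetical helper: add any absent column as NA, then select in order.
select_cols <- function(df, cols) {
  missing <- setdiff(cols, names(df))
  if (length(missing) > 0) {
    message("Missing columns filled with NA: ",
            paste(missing, collapse = ", "))
    for (m in missing) df[[m]] <- rep(NA_character_, nrow(df))
  }
  df[, cols, drop = FALSE]
}

# Toy table without a 'product' column.
toy <- data.frame(seqid = "c1", type = "CDS", start = 1L, end = 90L,
                  strand = "+", org = "A", locus_tag = "A_00001",
                  stringsAsFactors = FALSE)
res <- select_cols(toy, cols)
```

Whether filling with NA is acceptable depends on what downstream code does with 'product'; the point is only to surface which column is absent.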
Thanks @malihaaziz !! And very sorry @jhcuarta, I completely forgot about this issue after my vacation.
I'm not able to reproduce the error. Could you provide a reproducible example? As small as possible, please. I would need the gffs and the gene_presence_absence.csv file. You could use WeTransfer or send them to my email: iferres at pasteur dot edu dot uy
Hi
@iferres I edited all the genome sequences and I'm re-annotating them right now. I hope to run roary soon and send you the files next week. Sorry for the delay.
Best regards
Don't worry, I'm the delayed person here.
I ran roary_2_pagoo with version 0.3.17 on a test dataset and everything looks OK:
> suppressPackageStartupMessages(library(pagoo))
> gffs <- list.files("gffs/", full.names=T)
> csv <- "roary_out/gene_presence_absence.csv"
> p <- roary_2_pagoo(csv, gffs)
Reading csv file (roary).
Processing csv file.
Reading gff file gffs//Hinfluenzae_2019.gff
Reading gff file gffs//Hinfluenzae_86-028NP.gff
Reading gff file gffs//Hinfluenzae_CGSHiCZ412602.gff
Reading gff file gffs//Hinfluenzae_KR494.gff
Reading gff file gffs//Hinfluenzae_PittEE.gff
Reading gff file gffs//Hinfluenzae_R2846.gff
Reading gff file gffs//Hinfluenzae_R2866.gff
Reading gff file gffs//Hinfluenzae_Rd_KW20.gff
Reading gff file gffs//Hinfluenzae_strain_FDAARGOS_199.gff
Reading gff file gffs//Hinfluenzae_strain_NCTC11931.gff
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
I'm running into the same error with both roary_2_pagoo and panaroo_2_pagoo, which is why I debugged your code. My genomes are prokka-annotated. I'm a bit surprised that you are not running into this error.
I have created a test that crashes. Can you please test it with your version of the code? This is a panaroo-generated csv.
gff-test.zip
gene_presence_absence-test.csv
@malihaaziz Yep, at least the panaroo csv file you sent me looks different from the one I used to test the function. There are two bugs: one pops up when passing only the csv, and the other when passing both the csv and the gff files. I want to take some time to see whether there are many versions of the csv and what is happening with the gffs. I don't want to fix it for this case and break it for the others. By the way, which version of panaroo are you using?
(base) [mlaziz@log002 ~]$ panaroo --version
panaroo 1.2.9
I downloaded the latest version of panaroo (version 1.3.2) and re-ran it on my gffs. I still get the same error when I run pagoo.
Thank you, I will look into this next week.
Hi, which version of prokka are you using? I found some discrepancies between the files you sent me and the ones I got with version 1.14.6-0 (conda). For instance, I see duplicated gene names:
==> 05_A8.gff <==
##gff-version 3
##sequence-region gnl|LIUPRICE|05_A8_5_1 1 1927179
gnl|LIUPRICE|05_A8_5_1 prokka gene 1 1338 . + . ID=05_A8_5_00001_gene;Name=dnaA;gene=dnaA;locus_tag=05_A8_5_00001
gnl|LIUPRICE|05_A8_5_1 Prodigal:002006 CDS 1 1338 . + 0 ID=05_A8_5_00001;Parent=05_A8_5_00001_gene;Name=dnaA;db_xref=COG:COG0593;gene=dnaA;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P05648;locus_tag=05_A8_5_00001;product=Chromosomal replication initiator protein DnaA;protein_id=gnl|LIUPRICE|05_A8_5_00001
gnl|LIUPRICE|05_A8_5_1 prokka gene 1569 2705 . + . ID=05_A8_5_00002_gene;Name=dnaN;gene=dnaN;locus_tag=05_A8_5_00002
gnl|LIUPRICE|05_A8_5_1 Prodigal:002006 CDS 1569 2705 . + 0 ID=05_A8_5_00002;Parent=05_A8_5_00002_gene;Name=dnaN;db_xref=COG:COG0592;gene=dnaN;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P05649;locus_tag=05_A8_5_00002;product=Beta sliding clamp;protein_id=gnl|LIUPRICE|05_A8_5_00002
gnl|LIUPRICE|05_A8_5_1 prokka gene 2921 3157 . + . ID=05_A8_5_00003_gene;locus_tag=05_A8_5_00003
gnl|LIUPRICE|05_A8_5_1 Prodigal:002006 CDS 2921 3157 . + 0 ID=05_A8_5_00003;Parent=05_A8_5_00003_gene;inference=ab initio prediction:Prodigal:002006;locus_tag=05_A8_5_00003;product=hypothetical protein;protein_id=gnl|LIUPRICE|05_A8_5_00003
gnl|LIUPRICE|05_A8_5_1 prokka gene 3160 4305 . + . ID=05_A8_5_00004_gene;Name=recF_1;gene=recF_1;locus_tag=05_A8_5_00004
gnl|LIUPRICE|05_A8_5_1 Prodigal:002006 CDS 3160 4305 . + 0 ID=05_A8_5_00004;Parent=05_A8_5_00004_gene;Name=recF_1;db_xref=COG:COG1195;gene=recF_1;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:Q8RDL3;locus_tag=05_A8_5_00004;product=DNA replication and repair protein RecF;protein_id=gnl|LIUPRICE|05_A8_5_00004
It is probably a flag you passed to prokka that subdivides annotations into "prodigal" and "prokka" (see the second column of each entry).
Yes indeed, it's the --compliant flag in prokka. I'm not sure how panaroo and roary treat this gff variant; I would suggest re-running prokka without the --compliant flag.
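A quick way to check whether a given gff has the layout shown in the entries above (a "gene" row and a "CDS" row sharing one locus_tag) is to tabulate the feature types and look for repeated locus tags. This is only a sketch: `read_gff_features` is a hypothetical helper written for this illustration, not pagoo's parser, and it assumes a tab-separated GFF3 with nine columns and an optional `##FASTA` section.

```r
# Read the feature table of a GFF3 file, skipping comment lines and
# anything from the "##FASTA" marker onwards.
read_gff_features <- function(path) {
  lines <- readLines(path)
  fasta_at <- which(lines == "##FASTA")
  if (length(fasta_at) > 0) lines <- lines[seq_len(fasta_at[1] - 1)]
  lines <- lines[!startsWith(lines, "#") & nzchar(lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  tab <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(tab) <- c("seqid", "source", "type", "start", "end",
                  "score", "strand", "phase", "attributes")
  # Extract locus_tag from the attributes column (NA when absent).
  has_tag <- grepl("locus_tag=", tab$attributes, fixed = TRUE)
  tab$locus_tag <- ifelse(
    has_tag,
    sub(".*locus_tag=([^;]+).*", "\\1", tab$attributes),
    NA_character_
  )
  tab
}

# Toy file mimicking the gene/CDS pairs (made-up identifiers).
tmp <- tempfile(fileext = ".gff")
writeLines(c(
  "##gff-version 3",
  "c1\tprokka\tgene\t1\t1338\t.\t+\t.\tID=x_00001_gene;locus_tag=x_00001",
  "c1\tProdigal:002006\tCDS\t1\t1338\t.\t+\t0\tID=x_00001;locus_tag=x_00001;product=DnaA",
  "##FASTA",
  ">c1",
  "ATGC"
), tmp)

feat <- read_gff_features(tmp)
type_counts <- table(feat$type)          # how many rows per feature type
dup_tags <- unique(feat$locus_tag[duplicated(feat$locus_tag)])
```

If `dup_tags` is non-empty, the same locus appears under more than one feature type, which matches the situation discussed here.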
Also, there is a bug which raises an error when reading the csv without the gffs. I'm pushing some changes to address that. I will let you know.
Thank you for troubleshooting this. I'll switch everything over to Bakta.
Now it should work with your gene_presence_absence.csv file:
# Reinstall pagoo from source
devtools::install_github("iferres/pagoo") # installs 0.3.18
library(pagoo)
p <- panaroo_2_pagoo("gene_presence_absence.csv")
Prokka is OK; the thing causing the issue is the --compliant flag.
Hi iferres
The same error continues to occur
setwd("~/Documents/pagoo")
suppressPackageStartupMessages(library(pagoo))
gffs <- list.files("/home/jason/Documents/pagoo/gffs", full.names=T)
csv <- "gene_presence_absence.csv"
p <- roary_2_pagoo(csv, gffs)
Reading csv file (roary).
Processing csv file.
Reading gff file /home/jason/Documents/pagoo/gffs/10432_62_LANL.gff
Reading gff file /home/jason/Documents/pagoo/gffs/107V1216_BRAC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1154_74_LANL.gff
Reading gff file /home/jason/Documents/pagoo/gffs/11S_UM.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1346_SC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1362_SC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/146N_ILS.gff
Reading gff file /home/jason/Documents/pagoo/gffs/146P_ILS.gff
.
.
.
Reading gff file /home/jason/Documents/pagoo/gffs/YB7A06_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YB7A09_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YB8E08_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN2011004_YPCDCP.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN89004_YPCDCP.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN97083_YPCDCP.gff
Error: subscript contains invalid names
I annotated all genomes using prokka 1.14.6, with the following command line:
prokka --setupdb && prokka --genus Vibrio --species cholerae --usegenus --prefix 1Mo_UM --cpus 8 --outdir 1Mo_UM --rfam --addgenes --addmrna --cdsrnaolap 1Mo_UM.fna
I used roary 3.13.0.
Here are the links to the respective files:
https://drive.google.com/file/d/1xTmiJMs12Du8e069oUoH0lJiCJE8EpPW/view?usp=sharing
https://drive.google.com/file/d/13d4fhOEKoaOCZa7V_8HrC5JRJ0uuQba4/view?usp=sharing
Do you think you could make a smaller reproducible example? My little laptop is screaming with 500 MB of compressed gff files. Make sure you capture the same error with the smaller dataset.
Ah, there's a similar issue with the gffs, every gene has 2 entries: one with tag "gene", and the other "mRNA". Let me see if I can make pagoo handle these cases, otherwise I will continue receiving issues like this one. Thanks both of you for reporting!
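One way to cope with several rows per gene is to keep only the CDS rows and then a single row per locus_tag. This is a hedged sketch of that idea, not the change that went into pagoo. `keep_cds` is a hypothetical helper, and the toy table assumes each locus also has a CDS row, which has not been checked against the real files.

```r
# Hypothetical helper: keep only CDS rows, then one row per locus_tag.
keep_cds <- function(feat) {
  cds <- feat[feat$type == "CDS", , drop = FALSE]
  cds[!duplicated(cds$locus_tag), , drop = FALSE]
}

# Toy feature table with gene / mRNA / CDS rows for two loci.
feat <- data.frame(
  type = c("gene", "mRNA", "CDS", "gene", "mRNA", "CDS"),
  locus_tag = rep(c("x_00001", "x_00002"), each = 3),
  stringsAsFactors = FALSE
)
cds_only <- keep_cds(feat)
```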
Now it should work @jhcuarta
#Reinstall pagoo from source
devtools::install_github("iferres/pagoo")
library(pagoo)
p <- panaroo_2_pagoo("gene_presence_absence.csv", list.files(path = "...", pattern="[.]gff$", full.names=T))
Hi iferres
Mine is roary_2_pagoo
Yep, sorry! It's the same fix, since the bug was in an internal function which is called by both panaroo_2_pagoo and roary_2_pagoo.
Try roary_2_pagoo and let me know.
Hi iferres
It ran just fine but threw some warnings. Is there a problem with those?
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)
warnings()
Warning messages:
1: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
2: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
3: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
4: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
5: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
.
.
.
45: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
46: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
47: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
48: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
49: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
50: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
Strange. I tried with your full dataset and had no warnings.
Can you paste here the output of sessionInfo()?
Hi
This is the output:
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)
sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_CO.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_CO.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_CO.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_CO.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] pagoo_0.3.18 ggplot2_3.4.1 Biostrings_2.66.0 GenomeInfoDb_1.34.9
[5] XVector_0.38.0 IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0
loaded via a namespace (and not attached):
[1] viridis_0.6.2 httr_1.4.4 sass_0.4.5
[4] tidyr_1.3.0 splines_4.2.2 jsonlite_1.8.4
[7] viridisLite_0.4.1 foreach_1.5.2 bslib_0.4.2
[10] shiny_1.7.4 assertthat_0.2.1 GenomeInfoDbData_1.2.9
[13] lattice_0.20-45 pillar_1.8.1 glue_1.6.2
[16] digest_0.6.31 GenomicRanges_1.50.2 RColorBrewer_1.1-3
[19] promises_1.2.0.1 colorspace_2.1-0 Matrix_1.5-3
[22] htmltools_0.5.4 httpuv_1.6.8 plyr_1.8.8
[25] pkgconfig_2.0.3 zlibbioc_1.44.0 purrr_1.0.1
[28] xtable_1.8-4 scales_1.2.1 webshot_0.5.4
[31] later_1.3.0 tibble_3.1.8 mgcv_1.8-41
[34] generics_0.1.3 ellipsis_0.3.2 DT_0.27
[37] cachem_1.0.6 withr_2.5.0 lazyeval_0.2.2
[40] cli_3.6.0 magrittr_2.0.3 crayon_1.5.2
[43] mime_0.12 heatmaply_1.4.2 fansi_1.0.4
[46] nlme_3.1-162 MASS_7.3-58.2 vegan_2.6-4
[49] shinydashboard_0.7.2 tools_4.2.2 registry_0.5-1
[52] data.table_1.14.6 lifecycle_1.0.3 stringr_1.5.0
[55] plotly_4.10.1 munsell_0.5.0 cluster_2.1.4
[58] compiler_4.2.2 jquerylib_0.1.4 ca_0.71.1
[61] rlang_1.0.6 grid_4.2.2 RCurl_1.98-1.10
[64] iterators_1.0.14 rstudioapi_0.14 htmlwidgets_1.6.1
[67] bitops_1.0-7 shinyWidgets_0.7.6 gtable_0.3.1
[70] codetools_0.2-19 DBI_1.1.3 TSP_1.2-2
[73] reshape2_1.4.4 R6_2.5.1 seriation_1.4.1
[76] gridExtra_2.3 dplyr_1.1.0 fastmap_1.1.0
[79] utf8_1.2.3 ggfortify_0.4.15 permute_0.9-7
[82] dendextend_1.16.0 stringi_1.7.12 parallel_4.2.2
[85] Rcpp_1.0.10 vctrs_0.5.2 tidyselect_1.2.0
You shouldn't have any problems. I will try to update my setup in the next few days to debug this warning; I'm working with slightly older versions of R and packages now. But pagoo should work if it could load the object, since many checks are done when a pangenome object is initialized.
Hi, I am now using bakta and ran panaroo to get the pangenome.
I'm now getting this error:
setwd("/lustre/groups/lab/mlaziz/Dp/panaroo-020923-bakta/020923_Dp-bakta-panarooV1.3.2")
gffs <- list.files(path = "/lustre/groups/lab/mlaziz/Dp/panaroo-020923-bakta/gff", pattern = "[.]gff3$", full.names = TRUE)
gpa_csv <- "gene_presence_absence.csv"
p <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)
Reading csv file (panaroo).
Processing csv file.
Removing 314 genes tagged as 'refound', 'stop', and/or 'length' by panaroo.
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/01_A1.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_B4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_F4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/03_C2.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/05_A8.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/33_A7.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/44MNt_B4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-Sm1_contigs.gff3
Error in .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord, :
solving row 847: 'allow.nonnarrowing' is FALSE and the supplied end (145482) is > refwidth
I've attached a shorter example. Can you please test it with your version of the code?
test.zip
I think I found it. The thing is that bakta is probably identifying features which start near the end of the contig and finish after the beginning, but it reports the end of the feature as an integer larger than the length of the contig. The parser I'm using gets confused by this. I have some meetings right now; I will do my best to fix it this afternoon.
Fixed (I hope). I checked both functions, panaroo_2_pagoo and roary_2_pagoo, with both of your datasets and pagoo loads them fine.
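To check a gff for the kind of feature described above, one can compare each feature's end coordinate with the contig length. The sketch below assumes the files declare contig lengths through GFF3 `##sequence-region` pragmas and are tab-separated. `find_overhanging` is a hypothetical helper for illustration and is unrelated to the parser pagoo uses.

```r
# Hypothetical helper: return feature rows whose end coordinate is larger
# than the length declared for their contig in "##sequence-region".
find_overhanging <- function(path) {
  lines <- readLines(path)
  fasta_at <- which(lines == "##FASTA")
  if (length(fasta_at) > 0) lines <- lines[seq_len(fasta_at[1] - 1)]

  # Pragmas look like: ##sequence-region <seqid> <start> <end>
  reg <- strsplit(lines[startsWith(lines, "##sequence-region")], "[ \t]+")
  region_end <- vapply(reg, function(x) as.numeric(x[4]), numeric(1))
  names(region_end) <- vapply(reg, function(x) x[2], character(1))

  # Feature lines: keep seqid, type, start and end.
  body <- lines[!startsWith(lines, "#") & nzchar(lines)]
  f <- strsplit(body, "\t", fixed = TRUE)
  feat <- data.frame(
    seqid = vapply(f, `[`, character(1), 1),
    type  = vapply(f, `[`, character(1), 3),
    start = as.numeric(vapply(f, `[`, character(1), 4)),
    end   = as.numeric(vapply(f, `[`, character(1), 5)),
    stringsAsFactors = FALSE
  )
  feat$contig_len <- unname(region_end[feat$seqid])
  feat[!is.na(feat$contig_len) & feat$end > feat$contig_len, , drop = FALSE]
}

# Toy file: the second feature runs past the end of a 1000 bp contig.
tmp <- tempfile(fileext = ".gff3")
writeLines(c(
  "##gff-version 3",
  "##sequence-region contig_1 1 1000",
  "contig_1\tBakta\tCDS\t10\t300\t.\t+\t0\tID=a;locus_tag=a",
  "contig_1\tBakta\tCDS\t950\t1100\t.\t+\t0\tID=b;locus_tag=b"
), tmp)
bad <- find_overhanging(tmp)
```

Rows returned in `bad` are candidates for the "supplied end is > refwidth" error; contigs without a `##sequence-region` line are simply skipped by this check.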
Hi
Same error
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)
warnings()
Warning messages:
1: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
2: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
3: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
4: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
.
.
.
47: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
48: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
49: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
50: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
Yes, sorry, I was referring to the bakta error. I was just making sure that I didn't break anything while fixing that. I haven't had the time to look at those warnings in detail, but a quick search tells me that it's just Biostrings downgrading an S4 class to a BString and dropping unnecessary internal object metadata.
Just ignore them.
Thank you! I re-pulled and installed pagoo via devtools. There is progress, but now I see this error:
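If the repeated warning clutters the console, it can be silenced selectively while leaving any other warning visible. This is a base R sketch. `muffle_matching` is a hypothetical wrapper and `noisy()` only stands in for the real call (for example `roary_2_pagoo(csv, gffs)`).

```r
# Hypothetical wrapper: run an expression and silence only warnings whose
# message contains `pattern`; all other warnings still surface.
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Stand-in for the real call; emits the warning seen in this thread.
noisy <- function() {
  warning("metadata columns on input DNAStringSet object were dropped")
  "done"
}
out <- muffle_matching(noisy(), "metadata columns on input DNAStringSet")
```

Matching on the message text keeps the filter narrow, so an unrelated warning from the same call would still be reported.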
p <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)
Reading csv file (panaroo).
Processing csv file.
Removing 314 genes tagged as 'refound', 'stop', and/or 'length' by panaroo.
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/01_A1.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_B4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_F4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/03_C2.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/05_A8.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/33_A7.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/44MNt_B4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-Sm1_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VPs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/81UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VAs-Sm8_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VPs-KB5_GCF_007197715.1_ASM719771v1_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/87UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88MNs-Sm2_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88VPs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/9VPs-B5_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/ATCC-51524_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/Dp_81Mnt_Sm4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1914_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1922_CDC39-95_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1931_CDC4294-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1933_CDC4545-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1934_CDC4709-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1937_CDC4199-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1939_CDC4792-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3033_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3043_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3050_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3052_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3065_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3069_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3070_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3077_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3084_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3086_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3090_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3246_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3250_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3256_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3264_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3274_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3911_genomic.gff3
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
row names contain missing values
Did it manage to read all the gffs, or did it fail when reading the last one that appears in the log (KPL3911)?
Another question: which panaroo version are you using? The error looks similar to #57.
panaroo (version 1.3.2)
bakta (version 1.6.1)
I have 46 gffs in the analysis. It looks like pagoo breaks at the 33rd. I can try the internal pagoo:::read_gff(gff_file) command to see which one it fails on.
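A loop with tryCatch can do this file-by-file check without stopping at the first failure. The sketch below is generic: `find_failing` is a hypothetical helper and `fake_reader` is a stand-in so the example runs on its own.

```r
# Hypothetical helper: apply `reader` to each file and collect the error
# message for every file that fails.
find_failing <- function(files, reader) {
  status <- vapply(files, function(f) {
    tryCatch({
      reader(f)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
  status[status != "ok"]
}

# Stand-in reader that fails on one file name, for demonstration only.
fake_reader <- function(f) {
  if (grepl("bad", f, fixed = TRUE)) stop("row names contain missing values")
  invisible(NULL)
}
failed <- find_failing(c("a.gff3", "bad.gff3", "c.gff3"), fake_reader)
```

With the real data, `reader` could be the internal `pagoo:::read_gff` mentioned above. Its exact signature is not confirmed here, and the error may also be raised later, when the per-file tables are combined, in which case every file would pass this check.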