roary_2_pagoo Error: subscript contains invalid names

Question

Closed this issue 2 years ago · 35 comments

Hi
I encountered the following error while trying to create an R6 class object. This is the script I'm using and the resulting error:

library(pagoo)

gffs <- list.files(pattern = "[.]gff$", recursive = TRUE, full.names = TRUE)

gpa_csv <- "/home/jason/Documents/pagoo/gene_presence_absence.csv"

p <- roary_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs, sep = "__", paralog_sep = "\t")

Reading csv file (roary).
Processing csv file.
Reading gff file ./10432_62_LANL.gff
Reading gff file ./107V1216_BRAC.gff
Reading gff file ./1154_74_LANL.gff
Reading gff file ./11S_UM.gff
Reading gff file ./1346_SC.gff
Reading gff file ./1362_SC.gff
.
.
.
Reading gff file ./YB8E08_UA.gff
Reading gff file ./YN2011004_YPCDCP.gff
Reading gff file ./YN89004_YPCDCP.gff
Reading gff file ./YN97083_YPCDCP.gff
Error: subscript contains invalid names

Answer 1 · 2022-12-27T16:48:34.000Z

Hi @jhcuarta , I'm out of office without my laptop the next two weeks. I'll put a reminder to address this as soon as I can. Sorry.
Bests!

Answer 2 · 2023-02-02T19:18:03.000Z

It's crashing on these lines in the roary_2_pagoo script:
## Selected columns
cols <- c('seqid', 'type', 'start', 'end', 'strand', 'product', 'org', 'locus_tag')
mcls <- lapply(mcls, function(x) x[ , cols])

Your read_gff function does not read in any 'product' field.
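
To illustrate the kind of failure described here, a minimal sketch in base R follows. The table `x` is a hypothetical stand-in for one element of `mcls` (the real object is built inside pagoo), and the NA-filling workaround is only an assumption about one way to guard the selection, not the fix that was applied in the package. The wording of the error depends on the table class, so a base data.frame will not print exactly "subscript contains invalid names".

```r
# Hypothetical stand-in for one element of `mcls`, lacking a 'product' column.
x <- data.frame(
  seqid = "contig_1", type = "CDS", start = 1L, end = 1338L,
  strand = "+", org = "05_A8", locus_tag = "05_A8_5_00001",
  stringsAsFactors = FALSE
)

cols <- c("seqid", "type", "start", "end", "strand", "product", "org", "locus_tag")

# Columns that are requested but absent from the table.
missing_cols <- setdiff(cols, colnames(x))
print(missing_cols)

# Selecting by a missing name fails; tryCatch captures the message instead of stopping.
msg <- tryCatch({
  x[, cols]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Defensive variant (an assumption, not pagoo's code): add absent columns as NA first.
for (m in missing_cols) x[[m]] <- NA_character_
selected <- x[, cols]
```

Checking `setdiff(cols, colnames(x))` before the subsetting step would point directly at the column that is missing.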

Answer 3 · 2023-02-02T19:49:47.000Z

Thanks @malihaaziz !! And very sorry @jhcuarta, I completely forgot about this issue after my vacation.

I'm not able to reproduce the error. Could you provide a reproducible example? As small as possible please 😬 . I would need the gffs and the gene_presence_absence.csv file. You could use WeTransfer or send them to my email: iferres at pasteur dot edu dot uy

Answer 4 · 2023-02-02T19:59:29.000Z

Hi
@iferres I edited all the genome sequences and I'm re-annotating them right now. I hope to run roary soon and send you the files next week. I beg your pardon.

Best regards

Answer 5 · 2023-02-02T20:11:24.000Z

Don't worry, I'm the delayed person here.
I ran roary_2_pagoo with version 0.3.17 on a test dataset and everything looks OK:

> suppressPackageStartupMessages(library(pagoo))
> gffs <- list.files("gffs/", full.names=T)
> csv <- "roary_out/gene_presence_absence.csv"
> p <- roary_2_pagoo(csv, gffs)
Reading csv file (roary).
Processing csv file.
Reading gff file gffs//Hinfluenzae_2019.gff
Reading gff file gffs//Hinfluenzae_86-028NP.gff
Reading gff file gffs//Hinfluenzae_CGSHiCZ412602.gff
Reading gff file gffs//Hinfluenzae_KR494.gff
Reading gff file gffs//Hinfluenzae_PittEE.gff
Reading gff file gffs//Hinfluenzae_R2846.gff
Reading gff file gffs//Hinfluenzae_R2866.gff
Reading gff file gffs//Hinfluenzae_Rd_KW20.gff
Reading gff file gffs//Hinfluenzae_strain_FDAARGOS_199.gff
Reading gff file gffs//Hinfluenzae_strain_NCTC11931.gff
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.

Answer 6 · 2023-02-02T20:16:04.000Z

I'm running into the same error with both roary_2_pagoo and panaroo_2_pagoo, which is why I debugged your code. My genomes are prokka annotated. I'm a bit surprised that you are not running into this error.

Answer 7 · 2023-02-02T20:23:10.000Z

I have created a test that crashes. Can you please test it with your version of the code? This is a panaroo-generated csv.
gff-test.zip
gene_presence_absence-test.csv

Answer 8 · 2023-02-03T14:39:57.000Z

@malihaaziz Yep, at least the panaroo csv file you sent me looks different from the one I used to test the function. There are two bugs: one pops up when passing only the csv, and the other when passing both the csv and the gff files. I want to take some time to see whether there are many versions of the csv and what is happening with the gffs. I don't want to fix it for this case and break it for the others. By the way, which version of panaroo are you using?

Answer 9 · 2023-02-03T15:44:50.000Z

(base) [mlaziz@log002 ~]$ panaroo --version
panaroo 1.2.9

Answer 10 · 2023-02-04T04:35:21.000Z

I downloaded the latest version of panaroo (version 1.3.2) and re-ran it on my gffs. I still get the same error when I run pagoo.

Answer 11 · 2023-02-04T15:55:39.000Z

Thank you, I will look into this next week.

Answer 12 · 2023-02-06T13:34:43.000Z

Hi, which version of prokka are you using? I find some discrepancies between the gffs you sent me and the ones I got with version 1.14.6-0 (conda). For instance, I see duplicated gene names:

==> 05_A8.gff <==
##gff-version 3
##sequence-region gnl|LIUPRICE|05_A8_5_1 1 1927179
gnl|LIUPRICE|05_A8_5_1	prokka	gene	1	1338	.	+	.	ID=05_A8_5_00001_gene;Name=dnaA;gene=dnaA;locus_tag=05_A8_5_00001
gnl|LIUPRICE|05_A8_5_1	Prodigal:002006	CDS	1	1338	.	+	0	ID=05_A8_5_00001;Parent=05_A8_5_00001_gene;Name=dnaA;db_xref=COG:COG0593;gene=dnaA;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P05648;locus_tag=05_A8_5_00001;product=Chromosomal replication initiator protein DnaA;protein_id=gnl|LIUPRICE|05_A8_5_00001
gnl|LIUPRICE|05_A8_5_1	prokka	gene	1569	2705	.	+	.	ID=05_A8_5_00002_gene;Name=dnaN;gene=dnaN;locus_tag=05_A8_5_00002
gnl|LIUPRICE|05_A8_5_1	Prodigal:002006	CDS	1569	2705	.	+	0	ID=05_A8_5_00002;Parent=05_A8_5_00002_gene;Name=dnaN;db_xref=COG:COG0592;gene=dnaN;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P05649;locus_tag=05_A8_5_00002;product=Beta sliding clamp;protein_id=gnl|LIUPRICE|05_A8_5_00002
gnl|LIUPRICE|05_A8_5_1	prokka	gene	2921	3157	.	+	.	ID=05_A8_5_00003_gene;locus_tag=05_A8_5_00003
gnl|LIUPRICE|05_A8_5_1	Prodigal:002006	CDS	2921	3157	.	+	0	ID=05_A8_5_00003;Parent=05_A8_5_00003_gene;inference=ab initio prediction:Prodigal:002006;locus_tag=05_A8_5_00003;product=hypothetical protein;protein_id=gnl|LIUPRICE|05_A8_5_00003
gnl|LIUPRICE|05_A8_5_1	prokka	gene	3160	4305	.	+	.	ID=05_A8_5_00004_gene;Name=recF_1;gene=recF_1;locus_tag=05_A8_5_00004
gnl|LIUPRICE|05_A8_5_1	Prodigal:002006	CDS	3160	4305	.	+	0	ID=05_A8_5_00004;Parent=05_A8_5_00004_gene;Name=recF_1;db_xref=COG:COG1195;gene=recF_1;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:Q8RDL3;locus_tag=05_A8_5_00004;product=DNA replication and repair protein RecF;protein_id=gnl|LIUPRICE|05_A8_5_00004

It's probably a flag you pass to prokka that subdivides the annotations into "Prodigal" and "prokka" entries (see the second column of each entry).
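
A quick way to spot this pattern is to count how many feature lines share each locus_tag. The sketch below is only an illustration in base R: the four GFF lines are shortened, made-up stand-ins modelled on the entries above, and keeping only the CDS rows is an assumption about one possible way to deduplicate, not a description of what pagoo does internally.

```r
# Shortened, made-up GFF lines modelled on the excerpt above (tab separated).
gff_lines <- c(
  "ctg1\tprokka\tgene\t1\t1338\t.\t+\t.\tID=x_00001_gene;locus_tag=x_00001",
  "ctg1\tProdigal:002006\tCDS\t1\t1338\t.\t+\t0\tID=x_00001;locus_tag=x_00001;product=DnaA",
  "ctg1\tprokka\tgene\t1569\t2705\t.\t+\t.\tID=x_00002_gene;locus_tag=x_00002",
  "ctg1\tProdigal:002006\tCDS\t1569\t2705\t.\t+\t0\tID=x_00002;locus_tag=x_00002;product=DnaN"
)

# Split each line into its nine GFF columns.
fields <- strsplit(gff_lines, "\t", fixed = TRUE)
type <- vapply(fields, function(f) f[3], character(1))
attrs <- vapply(fields, function(f) f[9], character(1))

# Pull the locus_tag value out of the attributes column.
locus_tag <- sub(".*locus_tag=([^;]+).*", "\\1", attrs)

# Locus tags that appear on more than one feature line.
per_tag <- table(locus_tag)
dup_tags <- names(per_tag)[per_tag > 1]
print(dup_tags)

# One possible deduplication: keep only the CDS rows.
features <- data.frame(locus_tag, type, stringsAsFactors = FALSE)
cds_only <- features[features$type == "CDS", ]
```

On a real file, the same steps could be run on the non-comment lines that precede the FASTA section.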

Answer 13 · 2023-02-06T14:37:29.000Z

Yes indeed, it's the --compliant flag in prokka. I'm not sure how panaroo and roary treat this gff variant, so I would suggest re-running prokka without the --compliant flag.

Also, there is a bug which raises an error when reading the csv without the gffs. I'm pushing some changes to address that. I will let you know.

Answer 14 · 2023-02-06T14:42:42.000Z

Thank you for troubleshooting this. I'll switch everything over to Bakta.

Answer 15 · 2023-02-06T14:56:33.000Z

Now it should work with your gene_presence_absence.csv file:

#Reinstall pagoo from source
devtools::install_github("iferres/pagoo") # installs 0.3.18

library(pagoo)
p <- panaroo_2_pagoo("gene_presence_absence.csv")

Prokka is ok, the thing which is causing the issue is the --compliant flag.

Answer 16 · 2023-02-11T16:55:26.000Z

Hi iferres
The same error continues to occur

setwd("~/Documents/pagoo")
suppressPackageStartupMessages(library(pagoo))
gffs <- list.files("/home/jason/Documents/pagoo/gffs", full.names=T)
csv <- "gene_presence_absence.csv"
p <- roary_2_pagoo(csv, gffs)
Reading csv file (roary).
Processing csv file.
Reading gff file /home/jason/Documents/pagoo/gffs/10432_62_LANL.gff
Reading gff file /home/jason/Documents/pagoo/gffs/107V1216_BRAC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1154_74_LANL.gff
Reading gff file /home/jason/Documents/pagoo/gffs/11S_UM.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1346_SC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/1362_SC.gff
Reading gff file /home/jason/Documents/pagoo/gffs/146N_ILS.gff
Reading gff file /home/jason/Documents/pagoo/gffs/146P_ILS.gff
.
.
.
Reading gff file /home/jason/Documents/pagoo/gffs/YB7A06_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YB7A09_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YB8E08_UA.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN2011004_YPCDCP.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN89004_YPCDCP.gff
Reading gff file /home/jason/Documents/pagoo/gffs/YN97083_YPCDCP.gff
Error: subscript contains invalid names

I annotated all genomes using prokka 1.14.6, with the following command line:
prokka --setupdb && prokka --genus Vibrio --species cholerae --usegenus --prefix 1Mo_UM --cpus 8 --outdir 1Mo_UM --rfam --addgenes --addmrna --cdsrnaolap 1Mo_UM.fna

On the other hand I used roary 3.13.0

Here are the links to the respective files
https://drive.google.com/file/d/1xTmiJMs12Du8e069oUoH0lJiCJE8EpPW/view?usp=sharing
https://drive.google.com/file/d/13d4fhOEKoaOCZa7V_8HrC5JRJ0uuQba4/view?usp=sharing

Answer 17 · 2023-02-13T13:40:55.000Z

Thank you @jhcuarta , I will look at it today.

Answer 18 · 2023-02-13T14:00:22.000Z

Do you think you could make a smaller reproducible example? My little laptop is screaming with 500Mb of compressed gff files 😅 . Make sure you capture the same error with the smaller dataset.

Answer 19 · 2023-02-13T14:09:54.000Z

Ah, there's a similar issue with the gffs, every gene has 2 entries: one with tag "gene", and the other "mRNA". Let me see if I can make pagoo handle these cases, otherwise I will continue receiving issues like this one. Thanks both of you for reporting!

Answer 20 · 2023-02-13T17:22:31.000Z

Now it should work @jhcuarta

#Reinstall pagoo from source
devtools::install_github("iferres/pagoo") 

library(pagoo)
p <- panaroo_2_pagoo("gene_presence_absence.csv", list.files(path = "...", pattern="[.]gff$", full.names=T))

Answer 21 · 2023-02-13T17:26:42.000Z

Hi iferres
Mine is roary_2_pagoo

Answer 22 · 2023-02-13T17:29:09.000Z

Yep, sorry! It's the same fix, since the bug was in an internal function which is called by both panaroo_2_pagoo and roary_2_pagoo.
Try roary_2_pagoo and let me know.

Answer 23 · 2023-02-13T18:31:31.000Z

Hi iferres
It ran just fine but threw some warnings. Is there a problem with those?

Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)

warnings()
Warning messages:
1: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
2: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
3: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
4: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
5: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
.
.
.
45: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
46: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
47: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
48: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
49: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
50: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped

Answer 24 · 2023-02-13T19:24:49.000Z

Strange. I tried with your full dataset and had no warnings.
Can you paste here the output of sessionInfo()?

Answer 25 · 2023-02-13T19:38:47.000Z

Hi
this is the output

Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)

sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_CO.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_CO.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_CO.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_CO.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] pagoo_0.3.18 ggplot2_3.4.1 Biostrings_2.66.0 GenomeInfoDb_1.34.9
[5] XVector_0.38.0 IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0

loaded via a namespace (and not attached):
[1] viridis_0.6.2 httr_1.4.4 sass_0.4.5
[4] tidyr_1.3.0 splines_4.2.2 jsonlite_1.8.4
[7] viridisLite_0.4.1 foreach_1.5.2 bslib_0.4.2
[10] shiny_1.7.4 assertthat_0.2.1 GenomeInfoDbData_1.2.9
[13] lattice_0.20-45 pillar_1.8.1 glue_1.6.2
[16] digest_0.6.31 GenomicRanges_1.50.2 RColorBrewer_1.1-3
[19] promises_1.2.0.1 colorspace_2.1-0 Matrix_1.5-3
[22] htmltools_0.5.4 httpuv_1.6.8 plyr_1.8.8
[25] pkgconfig_2.0.3 zlibbioc_1.44.0 purrr_1.0.1
[28] xtable_1.8-4 scales_1.2.1 webshot_0.5.4
[31] later_1.3.0 tibble_3.1.8 mgcv_1.8-41
[34] generics_0.1.3 ellipsis_0.3.2 DT_0.27
[37] cachem_1.0.6 withr_2.5.0 lazyeval_0.2.2
[40] cli_3.6.0 magrittr_2.0.3 crayon_1.5.2
[43] mime_0.12 heatmaply_1.4.2 fansi_1.0.4
[46] nlme_3.1-162 MASS_7.3-58.2 vegan_2.6-4
[49] shinydashboard_0.7.2 tools_4.2.2 registry_0.5-1
[52] data.table_1.14.6 lifecycle_1.0.3 stringr_1.5.0
[55] plotly_4.10.1 munsell_0.5.0 cluster_2.1.4
[58] compiler_4.2.2 jquerylib_0.1.4 ca_0.71.1
[61] rlang_1.0.6 grid_4.2.2 RCurl_1.98-1.10
[64] iterators_1.0.14 rstudioapi_0.14 htmlwidgets_1.6.1
[67] bitops_1.0-7 shinyWidgets_0.7.6 gtable_0.3.1
[70] codetools_0.2-19 DBI_1.1.3 TSP_1.2-2
[73] reshape2_1.4.4 R6_2.5.1 seriation_1.4.1
[76] gridExtra_2.3 dplyr_1.1.0 fastmap_1.1.0
[79] utf8_1.2.3 ggfortify_0.4.15 permute_0.9-7
[82] dendextend_1.16.0 stringi_1.7.12 parallel_4.2.2
[85] Rcpp_1.0.10 vctrs_0.5.2 tidyselect_1.2.0

Answer 26 · 2023-02-13T19:56:39.000Z

You shouldn't have any problems. I will try to update my setup in the next few days to debug this warning; I'm working with slightly older versions of R and packages at the moment. But pagoo should work if it could load the object, since many checks are done when a pangenome object is initialized.

Answer 27 · 2023-02-14T03:15:09.000Z

Hi, I am now using bakta and ran panaroo to get the pangenome.
I'm now getting this error:

setwd("/lustre/groups/lab/mlaziz/Dp/panaroo-020923-bakta/020923_Dp-bakta-panarooV1.3.2")
gffs <- list.files(path = "/lustre/groups/lab/mlaziz/Dp/panaroo-020923-bakta/gff", pattern = "[.]gff3$", full.names = TRUE)
gpa_csv <- "gene_presence_absence.csv"
p <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)
Reading csv file (panaroo).
Processing csv file.
Removing 314 genes tagged as 'refound', 'stop', and/or 'length' by panaroo.
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/01_A1.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_B4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_F4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/03_C2.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/05_A8.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/33_A7.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/44MNt_B4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-Sm1_contigs.gff3
Error in .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord, :
solving row 847: 'allow.nonnarrowing' is FALSE and the supplied end (145482) is > refwidth

I've attached a shorter example. Can you please test it with your version of the code?
test.zip

Answer 28 · 2023-02-14T13:58:56.000Z

I think I found it. Bakta is probably identifying features which start near the end of a contig and finish after its beginning, but it reports the end of the feature as an integer larger than the length of the contig. The parser I'm using gets confused by this. I have some meetings right now; I will do my best to fix it this afternoon.
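
For anyone who wants to check their own gffs for this situation, here is a minimal sketch in base R. The contig length (145000) and the feature table are made-up values chosen to resemble the error message above; the wrap-around arithmetic is only an assumption about how such coordinates can be interpreted, and it is not the fix that went into pagoo.

```r
# Hypothetical contig length, e.g. as declared in a ##sequence-region line.
contig_len <- c(contig_1 = 145000L)

# Hypothetical features; the second one ends past the end of the contig.
feat <- data.frame(
  seqid = c("contig_1", "contig_1"),
  start = c(100L, 144900L),
  end   = c(1200L, 145482L),
  stringsAsFactors = FALSE
)

# Length of the contig each feature sits on.
len <- unname(contig_len[feat$seqid])

# Features whose reported end exceeds the contig length.
over <- feat$end > len
print(feat[over, ])

# Under the wrap-around assumption, the feature continues from position 1
# up to (end - contig length) on the same contig.
feat$wrap_end <- ifelse(over, feat$end - len, NA_integer_)
print(feat)
```

Rows flagged by `over` are the ones a coordinate-checking parser would be expected to reject.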

Answer 29 · 2023-02-14T20:10:46.000Z

Fixed (I hope 😅 ). I checked both functions, panaroo_2_pagoo and roary_2_pagoo, with both of your datasets and pagoo loads them fine.

Answer 30 · 2023-02-14T21:16:21.000Z

Hi
Same error
Loading PgR6MS class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Checking input sequences.
Checking that sequence names matches with DataFrame.
Adding metadata to sequences.
Done.
There were 50 or more warnings (use warnings() to see the first 50)

warnings()
Warning messages:
1: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
2: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
3: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
4: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
.
.
.
47: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
48: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
49: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped
50: In FUN(X[[i]], ...) :
metadata columns on input DNAStringSet object were dropped

Answer 31 · 2023-02-14T22:48:19.000Z

Yes, sorry, I was referring to the bakta error. I was just making sure that I didn't break anything while fixing that. I haven't had the time to look at those warnings in detail, but a quick search tells me that it's just Biostrings downgrading an S4 class to a BString and dropping unnecessary internal object metadata:

https://github.com/Bioconductor/Biostrings/blob/c94e8fb082601cf3e3998df82cf1a9b39c72cb27/R/XStringSet-class.R#L331-L333

Just ignore them.
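
If the repeated messages are distracting, one option is to muffle only this specific warning and let any other warning through. The sketch below uses base R's withCallingHandlers; `noisy_step` is a hypothetical stand-in for whatever call emits the warning (in this thread that would be the roary_2_pagoo call), so treat it as an illustration rather than a recommendation from the package.

```r
# Hypothetical stand-in for the call that emits the warning.
noisy_step <- function() {
  warning("metadata columns on input DNAStringSet object were dropped")
  "done"
}

target <- "metadata columns on input DNAStringSet object were dropped"
muffled <- character(0)

result <- withCallingHandlers(
  noisy_step(),
  warning = function(w) {
    # Silence only the known, harmless message; other warnings still surface.
    if (grepl(target, conditionMessage(w), fixed = TRUE)) {
      muffled <<- c(muffled, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  }
)

print(result)
print(length(muffled))
```

Unlike a blanket suppressWarnings(), this keeps unrelated warnings visible.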

Answer 32 · 2023-02-15T02:28:47.000Z

Thank you! I re-pulled and installed pagoo via devtools. There is progress, but now I see this error:

p <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)
Reading csv file (panaroo).
Processing csv file.
Removing 314 genes tagged as 'refound', 'stop', and/or 'length' by panaroo.
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/01_A1.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_B4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_F4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/03_C2.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/05_A8.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/33_A7.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/44MNt_B4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-Sm1_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VPs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/81UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VAs-Sm8_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VPs-KB5_GCF_007197715.1_ASM719771v1_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/87UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88MNs-Sm2_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88VPs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/9VPs-B5_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/ATCC-51524_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/Dp_81Mnt_Sm4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1914_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1922_CDC39-95_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1931_CDC4294-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1933_CDC4545-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1934_CDC4709-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1937_CDC4199-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1939_CDC4792-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3033_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3043_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3050_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3052_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3065_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3069_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3070_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3077_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3084_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3086_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3090_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3246_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3250_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3256_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3264_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3274_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3911_genomic.gff3
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
row names contain missing values

Answer 33 · 2023-02-15T12:42:35.000Z

Did it manage to read all the gffs, or did it fail when reading the last one that appears in the log (KPL3911)?

Answer 34 · 2023-02-15T13:09:48.000Z

Another question: which panaroo version are you using? The error looks similar to #57 .

Answer 35 · 2023-02-16T18:58:16.000Z

panaroo (Version- 1.3.2)
bakta (Version-1.6.1)
I have 46 gffs in the analysis. It looks like pagoo breaks at the 33rd. I can try the internal pagoo:::read_gff(gff_file) command to see which file it dislikes.
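
A possible way to run that check over every file at once is sketched below in base R. `find_failing` and `toy_reader` are hypothetical helpers written for this illustration; with pagoo installed, the reader passed in would presumably be the internal pagoo:::read_gff mentioned above, whose exact arguments I have not verified.

```r
# Hypothetical helper: call `reader` on each file and collect error messages.
find_failing <- function(files, reader) {
  errs <- vapply(files, function(f) {
    tryCatch({
      reader(f)
      NA_character_            # no error for this file
    }, error = function(e) conditionMessage(e))
  }, character(1))
  errs[!is.na(errs)]           # keep only the files that failed
}

# Toy reader used only to demonstrate the helper; it fails on names containing "bad".
toy_reader <- function(f) {
  if (grepl("bad", f, fixed = TRUE)) stop("row names contain missing values")
  TRUE
}

failing <- find_failing(c("a.gff3", "bad.gff3", "c.gff3"), toy_reader)
print(failing)
```

The result is a named character vector, so the names give the offending files and the values give the corresponding error messages.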