Projection file returns spots for genes not in RGPs
Closed this issue · 3 comments
I noticed that the projection file can return spots for genes that are not listed as belonging to an RGP. My understanding from the panRGP manuscript is that spots are, by definition, groups of RGPs in the same genomic location. I was wondering whether this is a bug or whether I have misunderstood the spot concept. My output files are linked below. Projections were created using `ppanggolin write --projection`. Thanks in advance!
Pangenome file: https://drive.proton.me/urls/R2AMAWJW3W#qZuCxaOzf2o9
Projection file, where the first row has spots but no RGPs: https://drive.proton.me/urls/QX64FPZK5M#xEZFoUROFMDp
Perhaps related: a gene in an RGP can have spots listed that other genes in the same RGP do not. For example, in the rows below (taken from the projection file linked above), the first two rows list one spot and the third row lists several, even though all three genes belong to the same RGP.
gene | contig | start | stop | strand | family | nb_copy_in_org | partition | persistent_neighbors | shell_neighbors | cloud_neighbors | RGPs | Spots | Modules |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BUR30_RS22995 | NZ_FRFY01000079.1 | 336 | 1058 | - | BUR30_RS22995 | 1 | cloud | 0 | 0 | 0 | NZ_FRFY01000079.1_RGP_0 | 95 | None |
BUR30_RS23000 | NZ_FRFY01000079.1 | 1208 | 1552 | + | BUR30_RS23000 | 1 | cloud | 0 | 0 | 0 | NZ_FRFY01000079.1_RGP_0 | 95 | None |
BUR30_RS23005 | NZ_FRFY01000079.1 | 1555 | 1989 | + | BUI81_RS18235 | 1 | cloud | 0 | 0 | 0 | NZ_FRFY01000079.1_RGP_0 | 95,82,75,89 | module_67 |
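Both symptoms can be checked mechanically. The following is a minimal sketch, assuming the projection file is tab-separated, uses the column names shown in the table above, and marks missing values as `None` or an empty cell. The function names `parse_set` and `check_projection` are hypothetical helpers, not part of PPanGGOLiN, and the sample keeps only a subset of the columns from the three rows above.

```python
import csv
import io
from collections import defaultdict


def parse_set(value):
    """Split a comma-separated cell into a set; empty or 'None' means no value."""
    if value is None:
        return set()
    value = value.strip()
    if not value or value == "None":
        return set()
    return {item.strip() for item in value.split(",") if item.strip()}


def check_projection(handle):
    """Return (genes with spots but no RGP, RGPs whose genes list different spots)."""
    reader = csv.DictReader(handle, delimiter="\t")  # assumption: TSV with a header row
    spots_without_rgp = []
    spot_sets_by_rgp = defaultdict(set)  # RGP name -> distinct spot sets seen on its genes
    for row in reader:
        rgps = parse_set(row.get("RGPs"))
        spots = parse_set(row.get("Spots"))
        if spots and not rgps:
            spots_without_rgp.append(row["gene"])
        for rgp in rgps:
            spot_sets_by_rgp[rgp].add(frozenset(spots))
    inconsistent = sorted(
        rgp for rgp, variants in spot_sets_by_rgp.items() if len(variants) > 1
    )
    return spots_without_rgp, inconsistent


# Reduced copy of the three rows above (only the columns the check needs).
header = ["gene", "family", "RGPs", "Spots", "Modules"]
rows = [
    ["BUR30_RS22995", "BUR30_RS22995", "NZ_FRFY01000079.1_RGP_0", "95", "None"],
    ["BUR30_RS23000", "BUR30_RS23000", "NZ_FRFY01000079.1_RGP_0", "95", "None"],
    ["BUR30_RS23005", "BUI81_RS18235", "NZ_FRFY01000079.1_RGP_0", "95,82,75,89", "module_67"],
]
sample = "\n".join("\t".join(r) for r in [header] + rows)

orphans, inconsistent = check_projection(io.StringIO(sample))
print("Genes with spots but no RGP:", orphans)
print("RGPs with inconsistent spots:", inconsistent)
```

On these three rows the check reports no gene with spots but no RGP, and flags `NZ_FRFY01000079.1_RGP_0` as inconsistent. To run it on a full projection file, pass an open file handle (opened with `newline=""`) instead of the `io.StringIO` sample.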
Hi,
Indeed, this is very much related: it looks like this field is filled by listing the spots that contain the gene's family, rather than the spot of the gene itself!
Thank you for the bug report; someone will fix it in the upcoming release.
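To illustrate the difference, here is a toy sketch of the two lookups. It is not PPanGGOLiN code: the dictionaries and function names are hypothetical stand-ins, and every entry prefixed with `hypothetical_` is invented for illustration. Only the gene `BUR30_RS23005`, its family `BUI81_RS18235`, its RGP, and the spot numbers 95, 82, 75, and 89 come from the rows above.

```python
# Toy model only: hypothetical stand-ins, not PPanGGOLiN data structures.
gene_to_family = {
    "BUR30_RS23005": "BUI81_RS18235",
    "hypothetical_gene_A": "BUI81_RS18235",  # invented family member in another genome
    "hypothetical_gene_B": "BUI81_RS18235",  # invented
    "hypothetical_gene_C": "BUI81_RS18235",  # invented
    "hypothetical_gene_outside_rgp": "BUI81_RS18235",  # invented, not in any RGP
}
gene_to_rgp = {
    "BUR30_RS23005": "NZ_FRFY01000079.1_RGP_0",
    "hypothetical_gene_A": "hypothetical_RGP_A",
    "hypothetical_gene_B": "hypothetical_RGP_B",
    "hypothetical_gene_C": "hypothetical_RGP_C",
}
rgp_to_spot = {
    "NZ_FRFY01000079.1_RGP_0": 95,
    "hypothetical_RGP_A": 82,
    "hypothetical_RGP_B": 75,
    "hypothetical_RGP_C": 89,
}


def spots_via_family(gene):
    """Family-level lookup: every spot reached by any gene of the same family."""
    family = gene_to_family[gene]
    spots = set()
    for other_gene, other_family in gene_to_family.items():
        rgp = gene_to_rgp.get(other_gene)
        if other_family == family and rgp in rgp_to_spot:
            spots.add(rgp_to_spot[rgp])
    return spots


def spot_via_gene(gene):
    """Gene-level lookup: only the spot of the RGP that contains this gene."""
    rgp = gene_to_rgp.get(gene)
    return rgp_to_spot.get(rgp) if rgp is not None else None


print(spots_via_family("BUR30_RS23005"), spot_via_gene("BUR30_RS23005"))
print(
    spots_via_family("hypothetical_gene_outside_rgp"),
    spot_via_gene("hypothetical_gene_outside_rgp"),
)
```

In this toy model the family-level lookup returns all four spots for `BUR30_RS23005`, while the gene-level lookup returns only 95. The same family-level lookup also assigns spots to a gene that is in no RGP at all, which would match the first symptom reported in this issue.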
Adelme
Great, thanks for getting back to me!