vanheeringen-lab/genomepy

DtypeWarning

siebrenf opened this issue · 0 comments

Installing an annotation from a URL triggers a pandas DtypeWarning. My guess is that the strand column of the BED file is causing this.

genomepy install [snip] -p url --URL-to-annotation https://github.com/jakke-neiro/Oxplatys/raw/gh-pages/Schmidtea_mediterranea_Oxford_v1.gtf.zip -l Smes
[snip]
14:24:25 | INFO | Annotation download successful
/home/siebrenf/git/genomepy/genomepy/annotation/sanitize.py:38: DtypeWarning: Columns (5) have mixed types.Specify dtype option on import or set low_memory=False.
  mc = filter_contigs(self)
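The warning text itself suggests two ways out: pass an explicit `dtype`, or set `low_memory=False`. Below is a minimal sketch of the first option. It assumes the annotation is read with `pandas.read_csv` and that the reported column index is zero-based, neither of which is confirmed here. The two rows are the BED lines shown further down, re-joined with tabs. The actual call in `sanitize.py` may look different.

```python
import io

import pandas as pd

# Two BED12 rows copied from this issue, tab-separated.
bed_text = (
    "dd_Smes_g4_1\t20345\t21503\tSMEST026673001.1\t0\t+\t21503\t21503\t0\t2\t140,932,\t0,226,\n"
    "dd_Smes_g4_1\t20461\t21503\tMSTRG.1.2\t0\t+\t21503\t21503\t0\t1\t1042,\t0,\n"
)

# Assumption: column 5 (zero-based) is the one named in the warning.
# Forcing it to str stops pandas from inferring a type per chunk,
# which is what produces "mixed types" on large files.
df = pd.read_csv(
    io.StringIO(bed_text),
    sep="\t",
    header=None,
    dtype={5: str},
)

print(df[5].tolist())
```

Passing `low_memory=False` instead would also silence the warning, at the cost of parsing the whole file in one pass.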

bed:

head .local/share/genomes/Smes/Smes.annotation.bed | column -t
dd_Smes_g4_1  20345   21503   SMEST026673001.1  0  +  21503   21503   0  2  140,932,     0,226,
dd_Smes_g4_1  20461   21503   MSTRG.1.2         0  +  21503   21503   0  1  1042,        0,

gtf:

head .local/share/genomes/Smes/Smes.annotation.gtf | column -t
dd_Smes_g4_1  StringTie  transcript  20346  21503  1000  +  .  gene_id  "MSTRG.1";  transcript_id  "SMEST026673001.1";  ref_gene_id  "SMESG000026673.1";
dd_Smes_g4_1  StringTie  exon        20346  20485  1000  +  .  gene_id  "MSTRG.1";  transcript_id  "SMEST026673001.1";  exon_number  "1";                 ref_gene_id  "SMESG000026673.1";
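To check which file actually holds the mixed column, a small standard-library helper can tally whether the values at a given zero-based index look numeric or not. This is only a diagnostic sketch: `column_value_kinds` is a hypothetical name, and the sample rows are made up for illustration (one with a numeric score, one with `.`). Note that index 5 is the strand in BED, but would be the score column in a GTF.

```python
from collections import Counter


def column_value_kinds(lines, index, sep="\t"):
    """Count numeric-looking vs. text values in one zero-based column."""
    kinds = Counter()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split(sep)
        if index >= len(fields):
            kinds["missing"] += 1
            continue
        try:
            float(fields[index])
            kinds["numeric"] += 1
        except ValueError:
            kinds["text"] += 1
    return kinds


# Made-up GTF-like rows (not from the real file): the score is 1000 in one
# row and "." in the other, so column 5 mixes numbers and text.
sample = [
    'chr1\tStringTie\texon\t100\t200\t1000\t+\t.\tgene_id "g1";\n',
    'chr1\tsource\texon\t300\t400\t.\t+\t.\tgene_id "g2";\n',
]

print(column_value_kinds(sample, 5))
```

For a real file, pass an open file handle as `lines`. If both `numeric` and `text` show up for the same column, that column is a likely source of the warning.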