open2c/bioframe

Bring back GTF attributes parser?


Options:

  • bring parse_gtf_attributes back as part of read_gtf(), with an option to parse the attributes column
  • wrap an existing external tool for gtf parsing and return a pandas dataframe
  • keep it as a separate operation to apply to any key-value like column.
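For reference, the third option could look roughly like the sketch below. It is an illustration only, not the parser that was previously in bioframe: the function body and the regex are assumptions, and it assumes the usual GTF attribute layout of `key "value";` pairs. Repeated keys (such as multiple `tag` entries) keep only their first value here.

```python
import re

import pandas as pd

# One `key "value";` or `key value;` pair from a GTF-style attribute string.
_ATTR_RE = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^";\s]+))\s*;?')


def parse_gtf_attributes(attrs: pd.Series) -> pd.DataFrame:
    """Expand a key-value string column into one column per key.

    Hypothetical sketch: values are returned as strings, missing keys
    become NaN, and the index of the input Series is preserved.
    """
    records = []
    for text in attrs.fillna(""):
        record = {}
        for key, quoted, bare in _ATTR_RE.findall(text):
            # Keep the first occurrence of a repeated key.
            record.setdefault(key, quoted if quoted else bare)
        records.append(record)
    return pd.DataFrame(records, index=attrs.index)
```

Because it only takes a Series, the result can be joined back onto any frame, e.g. `df.join(parse_gtf_attributes(df["attributes"]))`, where the column name `attributes` is an assumption about how the file was read.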

Any thoughts on which option would be best? cc @nvictus @agalitsyna. It seems there was quite an extensive discussion in #123.

@smitkadvani was interested in implementing whichever solution makes the most sense, as it might clean up some code in one of his ongoing projects.

For what it's worth, I have not had any issues with gtfparse on either GENCODE or ENSEMBL.

So we could just have a wrapper around gtfparse.read_gtf, and rename 'seqname' to 'chrom' to have a valid bedframe?
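A minimal sketch of that wrapper idea, under a few assumptions: the names `gtf_to_bedframe` and `read_gtf_as_bedframe` are hypothetical, the reader is assumed to return a table with `seqname`, `start` and `end` columns, and coordinates are passed through unchanged (GTF positions are 1-based and inclusive, so a real implementation would need to decide whether to convert them). The reader is injectable so the renaming step can be tried without gtfparse installed.

```python
import pandas as pd


def gtf_to_bedframe(df: pd.DataFrame) -> pd.DataFrame:
    """Rename 'seqname' to 'chrom' and move chrom/start/end to the front."""
    out = df.rename(columns={"seqname": "chrom"})
    lead = ["chrom", "start", "end"]
    missing = [c for c in lead if c not in out.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rest = [c for c in out.columns if c not in lead]
    return out[lead + rest]


def read_gtf_as_bedframe(path, reader=None) -> pd.DataFrame:
    """Read a GTF with `reader` (default: gtfparse.read_gtf) and rename columns."""
    if reader is None:
        # Third-party dependency, imported lazily so it stays optional.
        from gtfparse import read_gtf as reader
    df = reader(path)
    # Assumption: a non-pandas result exposes to_pandas(), as polars frames do.
    if not isinstance(df, pd.DataFrame) and hasattr(df, "to_pandas"):
        df = df.to_pandas()
    return gtf_to_bedframe(df)
```

With gtfparse installed, `read_gtf_as_bedframe("annotation.gtf")` would be the intended call; the file name is a placeholder.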