calico/basenji

How to prepare TFRecords files for saluki training

Closed this issue · 1 comment

Thank you for publishing these great resources.
"basenji/jupyter/saluki_data.ipynb" demonstrates how to prepare TFRecords files for saluki.
The TFRecords files produced in that notebook contain the features "lengths", "utr5", "cds", "utr3", "features", and "targets".
However, the actual TFRecords files used for saluki training (https://zenodo.org/record/6326409#.ZGrDWXZBxD8) contain "lengths", "sequence", "coding", "splice", and "targets".
Do you plan to publish how you prepared these TFRecords files?

Hi, that notebook reflects an earlier version of the TFRecords generation workflow. I have replaced it with our final version in the latest push to the master branch.
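
For anyone who wants to check which features a given TFRecords file actually contains, here is a minimal sketch that lists the feature keys of the first record using only the Python standard library. It relies on the documented TFRecord framing (8-byte length, 4-byte CRC, payload, 4-byte CRC) and on the `tf.train.Example` protobuf layout. The function names are hypothetical, CRCs are not verified, and the `zlib_compressed=True` default is an assumption about how the files were written, not something confirmed in this thread.

```python
import struct
import zlib


def read_varint(buf, pos):
    """Decode one protobuf varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, payload) for each length-delimited field in buf."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 2:  # only length-delimited fields are expected here
            raise ValueError(f"unexpected wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        yield field_number, buf[pos:pos + length]
        pos += length


def example_feature_keys(example_bytes):
    """Return the sorted feature names of one serialized tf.train.Example."""
    keys = []
    for num, features in iter_fields(example_bytes):   # Example.features = 1
        if num != 1:
            continue
        for num, entry in iter_fields(features):       # Features.feature = 1 (map)
            if num != 1:
                continue
            for num, payload in iter_fields(entry):    # map entry: key = 1, value = 2
                if num == 1:
                    keys.append(payload.decode("utf-8"))
    return sorted(keys)


def iter_records(data):
    """Yield raw record payloads from TFRecord-framed bytes (CRCs are skipped)."""
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from("<Q", data, pos)
        start = pos + 12                 # 8-byte length + 4-byte length CRC
        yield data[start:start + length]
        pos = start + length + 4         # skip the 4-byte payload CRC


def tfrecord_feature_keys(path, zlib_compressed=True):
    """List the feature keys of the first record in a TFRecords file."""
    with open(path, "rb") as handle:
        data = handle.read()
    if zlib_compressed:                  # assumption: file written with ZLIB/GZIP
        data = zlib.decompress(data, wbits=47)  # 47 auto-detects zlib or gzip
    return example_feature_keys(next(iter_records(data)))
```

Calling `tfrecord_feature_keys("path/to/file.tfr")` on a file from the Zenodo archive should, if the description above is accurate, return "coding", "lengths", "sequence", "splice", and "targets"; pass `zlib_compressed=False` if the file turns out to be uncompressed. This reads the whole file into memory, which is fine for a quick check but not for very large files.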