refactor intermediate data formats

Question

bryantChhun opened this issue 3 years ago · 0 comments

Before we can think about better parallelization and PyTorch dataloaders, we need to rethink the intermediate data formats for dynamorph.

For each stage of the pipeline, we should define the inputs and outputs more precisely: file format, dimensionality, and file name.
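
One way to make this concrete would be a small per-stage spec that records those three things and can be checked across stages. The sketch below is only an illustration: `ArraySpec`, `StageSpec`, `check_chain`, the stage names, the `npy` format, and the axis labels are all hypothetical placeholders, not existing dynamorph code or its actual formats.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ArraySpec:
    """Describes one array that a stage reads or writes (hypothetical)."""
    file_format: str        # placeholder, e.g. "npy"
    dims: Tuple[str, ...]   # axis names, e.g. ("T", "C", "Y", "X")
    name_template: str      # file-name pattern with named fields

    def filename(self, **fields: str) -> str:
        # Fill in the template and append the format as the extension.
        return self.name_template.format(**fields) + "." + self.file_format


@dataclass(frozen=True)
class StageSpec:
    """Declared inputs and outputs of one pipeline stage (hypothetical)."""
    name: str
    inputs: Tuple[ArraySpec, ...]
    outputs: Tuple[ArraySpec, ...]


def check_chain(stages: Sequence[StageSpec]) -> List[str]:
    """List every input that the immediately preceding stage does not write."""
    problems = []
    for prev, nxt in zip(stages, stages[1:]):
        for spec in nxt.inputs:
            if spec not in prev.outputs:
                problems.append(
                    f"{nxt.name} expects {spec.name_template}, "
                    f"which {prev.name} does not write"
                )
    return problems


# Placeholder stages; real stage names and formats would come from the pipeline.
raw = ArraySpec("npy", ("T", "C", "Y", "X"), "{site}_raw")
patches = ArraySpec("npy", ("N", "C", "Y", "X"), "{site}_patches")
stage_a = StageSpec("stage_a", inputs=(raw,), outputs=(patches,))
stage_b = StageSpec("stage_b", inputs=(patches,), outputs=())

print(raw.filename(site="A1"))
print(check_chain([stage_a, stage_b]))
```

Writing the specs down this way would also make it easier to see which stages write arrays that no later stage reads.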

We primarily need:

Can we avoid data duplication? Are there intermediate stages that could skip writing out a second copy of the data?