Support multi-dimensional TMAPs in output

Question

paolodi opened this issue 4 years ago · 1 comment

What
Multi-dimensional TMAPs (e.g., operations on raw data) are currently second-class citizens in execution modes that are expected to produce outputs. For example, inference with segmentation models produces either summary statistics of segmentation quality (via infer_with_pixels) or PNGs stripped of the metadata required for interpretation and reuse (via plot_predictions). The explore mode bypasses multi-dimensional TMAPs altogether.

Why
The ability to easily reuse evaluated TMAPs (via inference or explore) is one of the key features of ML4H, and it has already allowed us to perform "extrapolation" tasks, where ML is used to infer a learned "rare" feature on an extended dataset (e.g., liver fat from standard MRI, LV mass and HRR from resting ECGs). So far, we have fully supported only scalar features, exchanged as CSV files; ongoing work on segmentation and parameterization requires extending this to more complex multi-dimensional data.

How
Allow outputs in more sophisticated file formats (HDF5 as a start) that can handle multi-dimensional (semi-)structured data. TMAPs contain enough information to interpret the data and guide its storage. Because producing multi-dimensional outputs is not always needed (and is potentially slow), the behavior should be enabled only through optional command-line flags.
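
As a rough illustration of the idea, the sketch below gates HDF5 output behind an optional flag and stores one evaluated tensor together with the metadata needed to interpret it. It uses h5py and NumPy. The flag name `--hdf5_output`, the helper `write_evaluated_tmap`, the `sample_id/tmap_name` group layout, and the attribute keys are all assumptions made for illustration, not existing ML4H options or APIs.

```python
import argparse
import json
import os
import tempfile

import h5py
import numpy as np


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag: unset by default, so multi-dimensional output is opt-in.
    parser.add_argument(
        "--hdf5_output",
        default=None,
        help="Path of an HDF5 file for evaluated multi-dimensional TMAPs.",
    )
    return parser


def write_evaluated_tmap(path, sample_id, tmap_name, array, metadata):
    """Store one evaluated tensor plus its metadata (hypothetical helper)."""
    with h5py.File(path, "a") as hd5:  # "a" creates the file or appends to it
        dataset = hd5.create_dataset(
            f"{sample_id}/{tmap_name}", data=array, compression="gzip"
        )
        for key, value in metadata.items():
            dataset.attrs[key] = value  # metadata travels with the data


out_path = os.path.join(tempfile.mkdtemp(), "evaluated_tmaps.hd5")
args = build_parser().parse_args(["--hdf5_output", out_path])

# Stand-in for a segmentation model output: height x width x channels.
prediction = np.zeros((4, 4, 3), dtype=np.float32)
prediction[..., 0] = 1.0  # every pixel assigned to the first channel

if args.hdf5_output is not None:
    write_evaluated_tmap(
        args.hdf5_output,
        "sample_0001",
        "segmentation",
        prediction,
        {
            "interpretation": "categorical",
            # Illustrative channel names, serialized as a JSON string attribute.
            "channel_map": json.dumps({"background": 0, "ventricle": 1, "myocardium": 2}),
        },
    )
```

Keeping the flag unset by default means scalar-only runs keep writing CSV files as before and pay no extra cost.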

Acceptance Criteria

Multi-dimensional TMAPs are accounted for in inference and explore modes
Users can produce files containing evaluated multi-dimensional TMAPs with their metadata preserved
Users can reuse the output files and the same TMAPs as model inputs
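
The last criterion could be checked with a round trip along these lines: read a stored tensor back, verify that its shape matches what the TMAP expects, and only then hand it to a model. This is a minimal sketch using h5py and NumPy; `load_evaluated_tmap`, the file layout, and the attribute key are hypothetical and stand in for whatever the final design defines.

```python
import os
import tempfile

import h5py
import numpy as np


def load_evaluated_tmap(path, sample_id, tmap_name, expected_shape):
    """Read one stored tensor and its metadata, checking the shape first (hypothetical helper)."""
    with h5py.File(path, "r") as hd5:
        dataset = hd5[f"{sample_id}/{tmap_name}"]
        if tuple(dataset.shape) != tuple(expected_shape):
            raise ValueError(
                f"{tmap_name}: stored shape {dataset.shape} != expected {expected_shape}"
            )
        # Copy data and attributes out before the file is closed.
        return dataset[()].astype(np.float32), dict(dataset.attrs)


# Create a tiny file to read back, so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "evaluated_tmaps.hd5")
with h5py.File(path, "w") as hd5:
    stored = hd5.create_dataset("sample_0001/segmentation", data=np.ones((2, 2, 2)))
    stored.attrs["interpretation"] = "categorical"

# Matching shape: the tensor can be used as a model input.
tensor, attrs = load_evaluated_tmap(path, "sample_0001", "segmentation", (2, 2, 2))

# Mismatched shape: the file does not fit the TMAP, so loading is refused.
try:
    load_evaluated_tmap(path, "sample_0001", "segmentation", (4, 4, 3))
    shape_mismatch_detected = False
except ValueError:
    shape_mismatch_detected = True
```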

Answer 1 · 2020-10-09T18:21:15.000Z

@lucidtronix @ndiamant, this is the structured TMAP issue I mentioned before. Please feel free to comment if you have any ideas or suggestions!

I will certainly ask for your help along the way, especially if we want this to work with autoencoders...