v4 <-> InferenceData integration
OriolAbril opened this issue · 2 comments
For now, this is a dump of things to take into account regarding the integration of PyMC v4 with InferenceData, which can be improved significantly now that the converter is part of the PyMC codebase.
Goals:
- Converting from MultiTrace to InferenceData should not lose any information: transformed variables, metadata, values in the report...
- feasible
- Storing sampler arguments, initial values, mass matrix... to allow reproducible sampling given a model object and an InferenceData object. Somewhat related to arviz-devs/arviz#220
- might not be feasible yet
Related issues:
- arviz-devs/arviz#230
- arviz-devs/arviz#420
- Making the converter Dask compatible
- arviz-devs/arviz#1224
- arviz-devs/arviz#1257
- arviz-devs/arviz#1509
- arviz-devs/arviz#1748
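As a rough illustration of the "storing sampler arguments" goal above: netCDF attributes only hold simple values (strings, numbers, flat arrays), so one option is to serialize the arguments into a single JSON string and keep it as a group attribute. This is only a sketch of the idea, not how the converter works; the helper names, the `sampler_args` attribute key and the example arguments are all hypothetical, and array-valued state such as a mass matrix would more likely need its own data variable.

```python
import json


def encode_sampler_args(sampler_kwargs):
    """Pack sampler keyword arguments into one JSON string attribute.

    Hypothetical helper: the "sampler_args" key is an assumption,
    not an existing InferenceData convention.
    """
    return {"sampler_args": json.dumps(sampler_kwargs, sort_keys=True)}


def decode_sampler_args(attrs):
    """Recover the keyword arguments from the attribute dictionary."""
    return json.loads(attrs["sampler_args"])


# Example arguments, made up for illustration.
kwargs = {
    "draws": 1000,
    "tune": 500,
    "chains": 4,
    "random_seed": [1, 2, 3, 4],
    "target_accept": 0.9,
}

attrs = encode_sampler_args(kwargs)    # would be attached to a group's attrs
restored = decode_sampler_args(attrs)  # what a later re-sampling step would read
```

The round trip only works for JSON-serializable values, which is one reason the full goal might not be feasible yet.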
Please add a way to return trace information from pm.sample (see https://discourse.pymc.io/t/inferencedata-is-missing-details-about-divergent-transitions/7740).
It might make sense to subclass InferenceData at some point to add extra conventions when reading/writing InferenceData files to netcdf or zarr. For example, it might be possible to retrieve the transformations on posterior variables from the names or attributes of the posterior and unconstrained_posterior groups (even if not always). I have also not been following the modelbuilder work, but it might help a bit there too?
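A minimal sketch of the name-based idea, assuming transformed variables follow the `<name>_<transform>__` naming pattern (e.g. `sigma_log__`) and that the variable names of both groups are available as plain lists. The `infer_transforms` helper is hypothetical, and transforms whose names contain digits or uppercase letters would be missed, which fits the "even if not always" caveat.

```python
import re

# Assumed naming pattern for transformed variables: "<name>_<transform>__".
_TRANSFORMED = re.compile(r"^(?P<name>.+)_(?P<transform>[a-z]+)__$")


def infer_transforms(posterior_vars, unconstrained_vars):
    """Map posterior variable names to the transform implied by their
    unconstrained counterpart's name (hypothetical helper)."""
    transforms = {}
    for var in unconstrained_vars:
        match = _TRANSFORMED.match(var)
        # Only report a transform when the base name exists in the posterior.
        if match and match["name"] in posterior_vars:
            transforms[match["name"]] = match["transform"]
    return transforms


# Example variable names, made up for illustration.
posterior_vars = ["mu", "sigma", "p"]
unconstrained_vars = ["mu", "sigma_log__", "p_logodds__"]

result = infer_transforms(posterior_vars, unconstrained_vars)
```

Storing the transform explicitly as a variable attribute would be more robust than parsing names, but it would need a convention that both the writer and the reader agree on.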