[EPIC] Extend `IOPointer`s to store values in addition to keys
shreyashankar opened this issue · 1 comments
Context: the current `IOPointer` abstraction only stores a string "pointer" to the data, or a key. An example might be `features.csv` or `model.joblib`. This means we currently can't do anything with the data, because we don't store any concept of it. If we store data, we could do many things, including the following:
- Compare current values to historical `ComponentRun`s' values
- Identify whether files have been tampered with outside of `ComponentRun`s
- Have more fine-grained tracing (record-level)
Storing the data in its entirety may be expensive. For now we will store a hash of the data, to get us one step closer to being able to store the data. This itself may be complex.
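As a rough illustration of the hashing idea, here is a minimal sketch that assumes the key is a path to a local file. The helper names `hash_file` and `has_changed` and the choice of SHA-256 are assumptions made for this example, not part of the existing `IOPointer` API.

```python
import hashlib


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    Hypothetical helper: reads the file in chunks so large files
    do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, recorded_hash):
    """Return True if the file's current hash differs from a stored one."""
    return hash_file(path) != recorded_hash
```

A stored value of this kind could be compared against a later `hash_file` result to flag a file that changed between runs. Keys that point to non-file data (for example, database tables) would need a different hashing strategy.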
Issues
In the future, we will incorporate an `IOPointer` "tag" model to store information about fine-grained tracing (i.e., PK values will be tags). This tagging is out of scope for the current project.
Goal: Have all this done by September 15 EOD