tensorwerk/stockroom

Support diff

Opened this issue · 6 comments

Description

Hangar's diff is not sufficient for stockroom, mainly because model storage saves the weights of a model across several arraysets/columns. Stockroom needs an abstracted diff that refines the diff data returned by hangar.

Is this something that could be improved on hangar's end? Or is it just a detail of stockroom's implementation?

Just on stockroom's side. I think I only need to refine the model diff, because of the way stockroom stores models.

Ok. Good luck!

I would like to take up this issue. Can you provide a bit more info about this?

Hi @jjmachan,
So the difference in diff data between hangar and stockroom is nil for data storage. For model storage, though, we need a few abstractions, because stockroom saves the weights of a single model in multiple columns. So we need to collate the diff from hangar and build a high-level diff before returning it to the user. Similarly, tags in stockroom are currently built on top of a str-typed column, which carries history. But the idea of a tag is to keep the information at the commit level (it acts like a commit message: when you check out another commit, you lose the tagged data). Does this make sense to you? Feel free to ask if you have any questions. Meanwhile, I'll come up with an example that should make it easier to understand.
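As a rough illustration of the collation step, here is a minimal sketch. It makes two assumptions that are not taken from the actual code bases: the input is a simplified stand-in for hangar's diff output (a mapping from change kind to column names, not hangar's real structure), and model columns follow a hypothetical `_model:<model name>:<part>` naming convention, which is not necessarily how stockroom names them.

```python
from collections import defaultdict

# Hypothetical naming convention, NOT stockroom's actual one:
# each model part lives in a column named "_model:<model name>:<part>".
MODEL_PREFIX = "_model:"


def collate_model_diff(column_diff):
    """Collapse a column-level diff into a model-level diff.

    `column_diff` maps a change kind ("added", "deleted", "mutated")
    to an iterable of column names that changed in that way.
    This is a simplified stand-in for hangar's diff output.
    """
    touched = defaultdict(set)  # model name -> set of change kinds seen
    other = defaultdict(list)   # change kind -> non-model columns
    for kind, columns in column_diff.items():
        for column in columns:
            if column.startswith(MODEL_PREFIX):
                model_name = column[len(MODEL_PREFIX):].split(":", 1)[0]
                touched[model_name].add(kind)
            else:
                other[kind].append(column)

    models = {}
    for name, kinds in touched.items():
        # If every changed column of a model changed the same way, report
        # that kind; otherwise report the model as mutated.
        models[name] = next(iter(kinds)) if len(kinds) == 1 else "mutated"
    return {"models": models, "columns": dict(other)}


raw = {
    "added": ["_model:resnet:bias_0"],
    "deleted": [],
    "mutated": ["_model:resnet:weight_0", "_model:resnet:weight_1", "images"],
}
high_level = collate_model_diff(raw)
print(high_level)
```

Here the three `resnet` columns collapse into a single model-level entry, while the plain `images` column passes through unchanged. The sketch only looks at changed columns, so a model that gains one new column while the rest stay untouched would be reported as added; a real implementation would likely need to know each model's full column list.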

So, from what I understand, tags are the most useful items for an effective diff, right? Since the diff is meant to be read by the user, the tags provide the most relevant information: model architecture, hyperparameters, losses, accuracies, etc. Those are the details that have to be reflected in the diff, right?
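To make the question concrete, here is a rough sketch of what a tag-level diff between two commits could look like. It assumes the tags of each commit can be read out as a plain name-to-string mapping; `diff_tags` and the sample tag names are hypothetical and not part of stockroom.

```python
def diff_tags(old_tags, new_tags):
    """Compare two {tag name: value} mappings, one per commit."""
    old_keys, new_keys = set(old_tags), set(new_tags)
    return {
        # Tags present only in the newer commit.
        "added": {k: new_tags[k] for k in new_keys - old_keys},
        # Tags present only in the older commit.
        "deleted": {k: old_tags[k] for k in old_keys - new_keys},
        # Tags present in both commits whose value changed: (old, new).
        "mutated": {
            k: (old_tags[k], new_tags[k])
            for k in old_keys & new_keys
            if old_tags[k] != new_tags[k]
        },
    }


# Values are strings because tags are stored in a str-typed column.
before = {"lr": "0.01", "epochs": "10"}
after = {"lr": "0.001", "epochs": "10", "val_acc": "0.93"}
tag_diff = diff_tags(before, after)
print(tag_diff)
```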