Lightning-AI/litgpt

test_tinyllama issue with LitData and `iterate_over_all`

Andrei-Aksionov opened this issue · 2 comments

Hi there 👋

Apparently there is an issue with the tinyllama test and the newest version of LitData (0.2.6).
In the release notes one can see that `iterate_over_all` has just been added:

Add support for iterate_over_all for the CombinedDataset by @tchaton in Lightning-AI/litdata#122

and that's why the issue didn't appear before.

I don't know whether this issue is on the LitGPT or the LitData side.
Maybe @awaelchli has some thoughts?

LitData made the decision to enforce `iterate_over_all` by default as a breaking change. LitGPT will have to set `iterate_over_all=False` explicitly now and require `litdata>=0.2.6`. The error message needs to be fixed though.

Yes, the default behaviour was confusing to some users. It felt more natural that all the samples should be seen, especially when the dataset is used for computing validation metrics.

As @awaelchli shared, let's add iterate_over_all to LitGPT where needed.
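For anyone landing here later, the difference between the two modes can be illustrated with a small pure-Python sketch. This is not LitData's implementation: `combine` is a hypothetical helper written only for illustration, and it assumes that with `iterate_over_all=False` iteration stops as soon as any sampled dataset runs out, while with `iterate_over_all=True` the remaining datasets keep being sampled until every one is exhausted.

```python
import random
from typing import Iterable, Iterator, List, Optional, Sequence


def combine(
    datasets: Sequence[Iterable],
    weights: Optional[Sequence[float]] = None,
    iterate_over_all: bool = True,
    seed: int = 42,
) -> Iterator:
    """Toy weighted mix of several datasets (hypothetical, not the LitData API)."""
    rng = random.Random(seed)
    iterators = [iter(d) for d in datasets]
    active: List[int] = list(range(len(iterators)))
    if weights is None:
        weights = [1.0] * len(iterators)
    while active:
        # Pick one of the still-active datasets according to its weight.
        idx = rng.choices(active, weights=[weights[i] for i in active])[0]
        try:
            yield next(iterators[idx])
        except StopIteration:
            if not iterate_over_all:
                # Assumed "old" behaviour: stop at the first exhausted dataset.
                return
            # Assumed "new" behaviour: drop it and keep sampling the others.
            active.remove(idx)


small = ["a"] * 2
large = ["b"] * 10

full = list(combine([small, large], iterate_over_all=True))
partial = list(combine([small, large], weights=[0.5, 0.5], iterate_over_all=False))

print(len(full), len(partial))
```

With `iterate_over_all=True` every sample from both toy datasets is yielded exactly once, which matches the motivation given above for validation metrics. With `iterate_over_all=False` the stream may end early, which is usually what is wanted when mixing training datasets by weight.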