feature proposal: continued model pre-training
Mr-Ye-Cao commented
When caption data is a strong bottleneck, i.e. when I have lots of similar-category image data without captions, it would be nice to offer continued pre-training of the model over images only.
bghira commented
what are you expecting to happen?
Mr-Ye-Cao commented
The hope is that Flux can see data it never encountered during its initial training and generalize over it: we would then only need to provide a small amount of captioned, supervised image data to train a more robust model.
bghira commented
if you don't use captions for a large enough training run, the model will just forget how to use them. it will not generalise that way.
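For context, the usual way diffusion trainers expose a model to unconditional samples without destroying its text conditioning is caption dropout, the same mechanism used when training for classifier-free guidance: only a fraction of samples are trained with an empty prompt, so the text pathway keeps receiving gradient. Below is a minimal sketch of that idea; the names (`caption_for_training`, `dataset`, `drop_rate`) are illustrative, not SimpleTuner's actual API.

```python
import random

def caption_for_training(caption, drop_rate=0.1, rng=random):
    """Pick the prompt to condition on for one training sample.

    Uncaptioned samples always train unconditionally (empty prompt).
    Captioned samples are dropped to "" only drop_rate of the time,
    i.e. classifier-free-guidance-style caption dropout. If *every*
    sample arrives as "", the text pathway gets no useful gradient
    and conditioning degrades -- the forgetting described above.
    """
    if caption is None:
        return ""              # no caption available: unconditional step
    if rng.random() < drop_rate:
        return ""              # CFG-style random caption dropout
    return caption             # keep the real caption

# Illustrative usage over a mixed dataset of (image, caption) pairs.
dataset = [
    ("img_0001.png", "a red bicycle leaning on a wall"),
    ("img_0002.png", None),    # uncaptioned image
    ("img_0003.png", None),
]
for image, caption in dataset:
    prompt = caption_for_training(caption)
    # ... encode `prompt` with the text encoder and train as usual ...
```

A drop rate around 10% is the commonly cited value for CFG training; continued pre-training on images only amounts to pushing it to 100%, which is exactly the regime where the model stops responding to captions.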