facebookresearch/ijepa

Using Pretrained Model

Giridhar9652 opened this issue · 4 comments

Is there any demo code available on how to implement the pre-trained model?

Hi @Giridhar9652, are you interested in using a pretrained model for a downstream task, or in re-implementing the I-JEPA pretraining?

Abc11c commented

Hi @MidoAssran,

I'm not sure whether this use case applies, but do you think we could complete the image of a specific subject (e.g., a human) with some fine-tuning?

(For example, fixing the hands or fingers given a picture of the whole person?)
Thanks!

Hi @Abc11c, if you have a responsible dataset setup for this sort of thing, I think your best bet may be to fine-tune the entire architecture (encoder, predictor, target-encoder) with this dataset. You would just need to swap out the data-loader initialization here with your own custom data loader.

You will probably also want to either use an architecture with a smaller patch size or, alternatively, increase the image resolution during pre-training.
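To make the patch-size versus resolution trade-off concrete, here is a small sketch of the arithmetic, assuming a standard ViT-style encoder that splits a square image into non-overlapping square patches. The helper name `patch_grid` and the example sizes are illustrative only and are not taken from the repository.

```python
from typing import Tuple


def patch_grid(image_size: int, patch_size: int) -> Tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Assumes non-overlapping square patches, as in a standard ViT.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# Compare a few hypothetical configurations.
for image_size, patch_size in [(224, 16), (224, 14), (448, 16)]:
    per_side, total = patch_grid(image_size, patch_size)
    print(f"{image_size}px, patch {patch_size}: "
          f"{per_side}x{per_side} = {total} patches")
```

Either a smaller patch or a higher resolution gives a finer grid, so a small region such as a hand is covered by more patches. The cost is a longer token sequence for the encoder and predictor.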

I'll also just follow up by saying, you can try training the model to predict specific regions in each image that you want to explicitly model, i.e., mask out the regions you want the model to get better at predicting and use them as targets.
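One way to picture this is to turn a pixel bounding box (for example, around a hand) into the set of patch indices that become prediction targets, and use the remaining patches as context. The sketch below assumes masks are flat, row-major indices into the patch grid. The function `box_to_patch_indices` and the box coordinates are hypothetical and are not part of the I-JEPA codebase, so you would need to adapt this to whatever mask format your training loop expects.

```python
from typing import List, Tuple


def box_to_patch_indices(box: Tuple[int, int, int, int],
                         image_size: int,
                         patch_size: int) -> List[int]:
    """Flat (row-major) indices of all patches overlapping a pixel box.

    box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    per_side = image_size // patch_size
    # First patch column/row touched by the box, clamped to the grid.
    col0 = max(0, x0 // patch_size)
    row0 = max(0, y0 // patch_size)
    # One past the last patch column/row touched (ceiling division).
    col1 = min(per_side, -(-x1 // patch_size))
    row1 = min(per_side, -(-y1 // patch_size))
    return [r * per_side + c
            for r in range(row0, row1)
            for c in range(col0, col1)]


# Hypothetical example: a 32x32 pixel region in a 224px image, patch size 16.
image_size, patch_size = 224, 16
target = box_to_patch_indices((32, 48, 64, 80), image_size, patch_size)

# Every patch that is not a target is available as context.
num_patches = (image_size // patch_size) ** 2
target_set = set(target)
context = [i for i in range(num_patches) if i not in target_set]

print("target patches:", target)
print("context patches:", len(context))
```

In practice the box would come from an annotation or a detector for the region you care about, with one set of target indices per image.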