baudm/parseq

Finetuning

Ehteshamciitwah opened this issue · 0 comments

Hello, thank you for sharing your work.

I evaluated the pre-trained PARSeq [32, 128] model on my custom dataset (sample images are attached below). Label lengths range from 3 to 20 characters.

However, word accuracy on this dataset with the pre-trained weights is only 56%. After fine-tuning your model with the default parameters, it only improves to 72%.
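For reference, this is roughly how I measured the 56% figure with the released weights (a sketch: the torch.hub call follows the README, while the preprocessing choices, the read_text helper, and the samples.tsv label file are my own):

```python
import torch
from PIL import Image
from torchvision import transforms as T

# Load the released PARSeq weights via torch.hub (as shown in the repo README).
parseq = torch.hub.load('baudm/parseq', 'parseq', pretrained=True).eval()

# Preprocessing assumed to match training: resize to the model's input size,
# then normalize to [-1, 1].
preprocess = T.Compose([
    T.Resize(parseq.hparams.img_size, T.InterpolationMode.BICUBIC),
    T.ToTensor(),
    T.Normalize(0.5, 0.5),
])

def read_text(image_path: str) -> str:
    """Greedy-decode a single image (my own helper, not from the repo)."""
    img = Image.open(image_path).convert('RGB')
    batch = preprocess(img).unsqueeze(0)          # (1, C, H, W)
    with torch.no_grad():
        logits = parseq(batch)                    # (1, L, num_classes)
    probs = logits.softmax(-1)
    labels, _confidences = parseq.tokenizer.decode(probs)
    return labels[0]

# samples.tsv is my own file: "<image_path>\t<ground_truth>" per line.
correct = total = 0
with open('samples.tsv') as f:
    for line in f:
        path, gt = line.rstrip('\n').split('\t')
        correct += int(read_text(path).lower() == gt.lower())
        total += 1
print(f'word accuracy: {100 * correct / total:.1f}%')
```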

What is the best way to fine-tune your model for a custom dataset? In particular, which of the following should I adjust? (My current understanding of how these map to the training configuration is sketched after the list.)

  1. Input image dimensions / patch size
  2. Encoder parameters (layers, heads, MLP ratio)
  3. Decoder parameters (layers, heads, MLP ratio)
  4. Decoding scheme
  5. Permutation count K
  6. Any additional recommendations?
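To make the question concrete, this is how I am currently launching fine-tuning, with the Hydra-style overrides that I believe correspond to items 1-5 (pretrained=parseq follows the README's fine-tuning example; the model.* key names are my reading of configs/model/parseq.yaml and may well be wrong, which is partly what I am asking):

```python
# Sketch of the ./train.py invocation I am using for fine-tuning.
# The values below are the published defaults; the key names are my reading
# of the repo's Hydra configs, so please correct any that are wrong.
overrides = {
    'model.img_size': '[32,128]',     # (1) input image dimensions
    'model.patch_size': '[4,8]',      # (1) ViT patch size
    'model.enc_depth': '12',          # (2) encoder layers
    'model.enc_num_heads': '6',       # (2) encoder heads
    'model.enc_mlp_ratio': '4',       # (2) encoder MLP ratio
    'model.dec_depth': '1',           # (3) decoder layers
    'model.dec_num_heads': '12',      # (3) decoder heads
    'model.dec_mlp_ratio': '4',       # (3) decoder MLP ratio
    'model.decode_ar': 'true',        # (4) autoregressive decoding
    'model.refine_iters': '1',        # (4) iterative-refinement steps
    'model.perm_num': '6',            # (5) number of permutations K
    'data.root_dir': '/path/to/my/lmdb/data',  # my dataset location (placeholder)
}

cmd = ['./train.py', 'pretrained=parseq'] + [f'{k}={v}' for k, v in overrides.items()]
print(' '.join(cmd))  # copy-paste the printed command to launch fine-tuning
```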

Additionally, how can we integrate a dictionary (lexicon) with the PARSeq models? I look forward to your response.
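The only approach I have come up with so far is post-processing: snapping each raw prediction to the most similar dictionary word, roughly as sketched below (the dictionary file words.txt, the 0.8 similarity threshold, and the helper names are my own, not from the repo). Is there a better way to constrain the decoder itself?

```python
from difflib import SequenceMatcher  # stdlib only; no extra dependencies

def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def snap_to_lexicon(prediction: str, lexicon: list[str], min_ratio: float = 0.8) -> str:
    """Replace the raw prediction with the most similar dictionary word,
    but only if it is similar enough; otherwise keep the raw prediction."""
    if not lexicon:
        return prediction
    best = max(lexicon, key=lambda word: similarity(prediction, word))
    return best if similarity(prediction, best) >= min_ratio else prediction

# words.txt is my own dictionary file, one entry per line.
with open('words.txt') as f:
    lexicon = [line.strip() for line in f if line.strip()]

print(snap_to_lexicon('hel1o', lexicon))  # e.g. -> 'hello' if it is in words.txt
```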

Sample images (attachments): imgD-log-0000_1_rotated_0, imgD-log-0000_2_rotated_180, imgD-log-0000_3_rotated_180, imgD-log-0003_1_rotated_0, imgD-log-11-0040_327_rotated_0