Finetuning
Ehteshamciitwah opened this issue · 0 comments
Hello, thank you for sharing your work.
I evaluated the pre-trained PARSeq model (input size [32, 128]) on a custom dataset; sample images are attached. Label lengths range from 3 to 20 characters.
However, word accuracy on this dataset with the pre-trained weights is only 56%. After fine-tuning the model with the default parameters, it rises to only 72%.
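For reference, word accuracy here means the share of images whose predicted string matches the label exactly. A minimal sketch of that metric, assuming `predictions` and `labels` are equal-length lists of strings (the function and argument names are illustrative and not part of the PARSeq code base):

```python
def word_accuracy(predictions, labels):
    """Return the percentage of predictions that exactly match their label.

    Hypothetical helper for illustration; the comparison is case-sensitive
    and applies no character-set filtering.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # A word counts as correct only if every character matches.
    correct = sum(pred == target for pred, target in zip(predictions, labels))
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    preds = ["abc", "abd", "xyz", "foo"]
    truth = ["abc", "abc", "xyz", "foo"]
    print(word_accuracy(preds, truth))  # 3 of 4 words match
```

Whether case or punctuation is normalized before comparison changes the number noticeably, so the figures above are only comparable under the same convention.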
What is the best way to fine-tune your model for the custom dataset?
- Input image dimensions / patch size
- Encoder parameters (layers, heads, ratio)
- Decoder parameters (layers, heads, ratio)
- Decoding scheme
- Permutation K value
- Any additional recommendations?
Additionally, how can a dictionary be integrated with the PARSeq models? I look forward to your response.
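To make the dictionary question concrete, one possible form of integration is lexicon post-correction: each decoded string is snapped to its closest dictionary entry when the match is strong enough. The sketch below is an assumption about what such a step could look like, not something PARSeq is known to provide; `snap_to_lexicon`, `lexicon`, and the `cutoff` value are hypothetical, and it uses only Python's standard `difflib`.

```python
import difflib


def snap_to_lexicon(prediction, lexicon, cutoff=0.8):
    """Replace a decoded string with its closest lexicon entry, if any.

    Hypothetical post-processing step: the model output is left untouched
    when no lexicon word is similar enough (similarity ratio below `cutoff`).
    """
    if prediction in lexicon:
        return prediction
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(prediction, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else prediction


if __name__ == "__main__":
    lexicon = ["invoice", "receipt", "total"]
    for raw in ["invoce", "total", "zzzz"]:
        print(raw, "->", snap_to_lexicon(raw, lexicon))
```

This kind of correction works on the final string only; it does not use the model's per-character probabilities, so a lower `cutoff` may overwrite correct out-of-dictionary words.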