Confidence score for paligemma

Question

Opened this issue 6 months ago · 7 comments

Hi @NielsRogge
I have fine-tuned PaliGemma on custom data for an image-to-JSON use case, but at inference some values come out wrong; for example, 3000 is extracted as 9000. To judge whether an extracted value is correct, can I get a confidence score from PaliGemma?

Answer 1 · 2024-05-30T20:30:29.000Z

Hi,

I'd recommend using a model with a higher image resolution (PaliGemma has checkpoints up to 896×896 resolution), providing more training examples, or training longer.

A confidence score can only be obtained for each token, as the model predicts one token at a time.
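To make the per-token idea concrete, here is a minimal sketch in plain Python. It assumes you already have the raw logits for each decoding step and the id of the token chosen at that step; the logits and vocabulary size below are made up for illustration and do not come from PaliGemma.

```python
import math

def softmax(logits):
    """Convert raw logits for one decoding step into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_confidences(step_logits, chosen_ids):
    """Return the probability the model assigned to each chosen token."""
    return [softmax(logits)[tok] for logits, tok in zip(step_logits, chosen_ids)]

# Hypothetical logits over a 3-token vocabulary for two decoding steps.
step_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 0.3]]
chosen_ids = [0, 2]  # the token picked at each step (here the argmax)

confs = token_confidences(step_logits, chosen_ids)
print([round(c, 3) for c in confs])
```

The first step has one clearly dominant logit, so its token gets a high probability; the second step is nearly flat, so the chosen token's probability is low. A misread digit such as 9 instead of 3 would ideally show up as a low-probability token of this kind, though a model can also be confidently wrong.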

Answer 2 · 2024-05-30T20:54:39.000Z

@NielsRogge thanks for your reply. I used the CORD dataset too: 800 training images on a g5.xlarge machine, for 8 epochs; I guess it stopped there because of early stopping. You tried the same dataset too. Apart from improving the training data, I just want a confidence score that I can use as a reference. Can you help me with that, i.e. a confidence score and a heatmap? Thanks once more for your reply; I follow your pages closely and they really interest me.

Answer 3 · 2024-05-31T06:49:41.000Z

OK, I've not benchmarked it yet; it could be that a bigger model or a model with higher image resolution performs better. I have similar notebooks on CORD for Donut, LLaVa and Idefics2. The latter has 8 billion parameters, so you'd need an A100 to fine-tune it, but its performance might also be a lot better.

However, it could also be that PaliGemma requires longer training.

Answer 4 · 2024-05-31T06:55:33.000Z

But in my case training stopped because of early stopping, I guess. Should I increase the patience? Also, are you planning to introduce a confidence score for PaliGemma?

Answer 5 · 2024-06-01T07:28:41.000Z

Yes, you could increase the patience. For the confidence score, see this answer: https://discuss.huggingface.co/t/announcement-generation-get-probabilities-for-generated-output/30075?u=nielsr
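For readers who want a starting point, here is a rough sketch of how per-token probabilities might be pulled out of `generate` in Transformers, using `output_scores=True`, `return_dict_in_generate=True` and `compute_transition_scores`. It assumes greedy decoding, a single image, and an already loaded `model` and `processor`; the function names `generate_with_confidence` and `flag_low_confidence`, the 0.9 threshold and the example pairs are all hypothetical, and the code has not been checked against a fine-tuned PaliGemma checkpoint.

```python
import math

def generate_with_confidence(model, processor, image, prompt, max_new_tokens=128):
    """Greedy-decode and return (token_text, probability) pairs for the new tokens."""
    import torch  # imported lazily so the helper below runs without torch installed

    inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            output_scores=True,            # keep the per-step scores
            return_dict_in_generate=True,  # return sequences and scores together
        )
    # Log-probability of each generated token (logits normalized with log-softmax).
    log_probs = model.compute_transition_scores(
        out.sequences, out.scores, normalize_logits=True
    )[0]
    # Assumption: the returned sequence starts with the prompt tokens.
    prompt_len = inputs["input_ids"].shape[1]
    new_ids = out.sequences[0, prompt_len:]
    return [
        (processor.decode([tok]), math.exp(lp))
        for tok, lp in zip(new_ids.tolist(), log_probs.tolist())
    ]

def flag_low_confidence(pairs, threshold=0.9):
    """Return the (token, probability) pairs whose probability is below threshold."""
    return [(tok, p) for tok, p in pairs if p < threshold]

# Hypothetical output for a misread amount: only the first digit is uncertain.
pairs = [("9", 0.55), ("0", 0.99), ("0", 0.98), ("0", 0.99)]
print(flag_low_confidence(pairs))
```

The threshold is arbitrary and would need tuning on held-out examples. Flagging individual low-probability tokens, rather than averaging over the whole JSON string, keeps a single doubtful digit from being hidden by many confident structural tokens.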

Answer 6 · 2024-06-05T11:18:44.000Z

Hi @NielsRogge
Could you also share the hardware requirements for fine-tuning PaliGemma on CORD? I am getting memory issues on Colab's T4 GPU. You mentioned that a T4 GPU is sufficient for inference. I would also like to extract items like checkboxes efficiently into JSON; my main objective is to match the level of Google's Document AI. Could you suggest some datasets for fine-tuning where the input files are application and survey forms rather than receipts?

Answer 7 · 2024-06-05T11:22:47.000Z

Hi @Sai-Monik, I used a g5.xlarge for fine-tuning and it worked.