Shape mismatch error while loading the pretrained model
pragna96 opened this issue · 1 comments
I get a shape mismatch error when I try to do the fine-tuning.
The same code was working perfectly on Colab, but when I try to run it on my GPU Linux server it gives an error, despite my copying over the same pretrained model files and gin files.
The error is as follows:
ValueError: Shape of variable decoder/block_000/layer_000/SelfAttention/k:0 ((512, 512)) doesn't match with shape of tensor decoder/block_000/layer_000/SelfAttention/k ([1024, 4096]) from checkpoint reader.
The operative config file specifies the pre-trained checkpoint path, so one of two things is happening: either you're overriding that path, or you're pointing at a model directory that was already used to save checkpoints for a differently-sized model, and TF is trying to load those checkpoints. In the latter case, just create a new, empty model directory for saving your fine-tuned checkpoints.
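A minimal sketch of the second fix, assuming a hypothetical directory name (`MODEL_DIR` here stands in for whatever path you pass to your fine-tuning command; adjust it to your setup):

```shell
# Point fine-tuning at a fresh, empty model directory so TF does not
# try to restore the differently-sized checkpoints saved there before.
# "$HOME/t5-finetune-run1" is a hypothetical path, not from the issue.
MODEL_DIR="$HOME/t5-finetune-run1"
mkdir -p "$MODEL_DIR"
# Sanity check: the directory must be empty before the first run,
# otherwise existing checkpoints in it will be restored.
ls -A "$MODEL_DIR"
```

If `ls -A` prints anything, old checkpoints are still present and the restore will again be attempted from them rather than from the pre-trained checkpoint named in the operative config.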