About pretrain model

Question


Closed this issue a year ago · 3 comments

When I train with the training script provided in the README, I get the following messages:

size mismatch for layers.0.blocks.0.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
size mismatch for layers.0.blocks.0.attn.relative_position_index: copying a param with shape torch.Size([144, 144]) from checkpoint, the shape in current model is torch.Size([484, 484]).
size mismatch for layers.0.blocks.1.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
size mismatch for layers.0.blocks.1.attn.relative_position_index: copying a param with shape torch.Size([144, 144]) from checkpoint, the shape in current model is torch.Size([484, 484]).
size mismatch for layers.1.blocks.0.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
......

The parameter shapes in the pre-trained checkpoint do not match those of the current model.
How can I solve this problem? I ran your script exactly as provided and did not modify any code.
Thanks!

Answer 1 · 2023-03-01T17:19:11.000Z

I am getting a similar problem, and also haven't modified the code at all.

Answer 2 · 2023-03-05T03:48:32.000Z

Bumping this up, does anyone know how to fix this?

Answer 3 · 2023-03-20T07:29:23.000Z

These mismatches have no impact: relative_coords_table is computed directly during model initialization, so it does not need to be loaded from the checkpoint.
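
For anyone who wants to check this, here is a minimal sketch. It assumes the model follows the Swin V2 buffer layout, where a square window of size W gives a relative_coords_table of shape (1, 2W-1, 2W-1, 2) and a relative_position_index of shape (W*W, W*W). Under that assumption the log above corresponds to a checkpoint built with window size 12 and a current model built with window size 22. The names `buffer_shapes` and `drop_mismatched` are hypothetical helpers and are not part of this repository.

```python
from types import SimpleNamespace


def buffer_shapes(window_size):
    """Expected buffer shapes for a square window (Swin V2 layout, assumed)."""
    side = 2 * window_size - 1           # number of distinct relative offsets per axis
    tokens = window_size * window_size   # number of tokens inside one window
    return (1, side, side, 2), (tokens, tokens)


def drop_mismatched(checkpoint_state, model_state):
    """Split checkpoint entries into those safe to load and those to skip.

    An entry is skipped only when the model has the same key with a
    different shape; the model then keeps the value it computed at init.
    """
    kept, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) != tuple(value.shape):
            skipped.append(name)
        else:
            kept[name] = value
    return kept, skipped


# Stand-in objects with a .shape attribute; real tensors work the same way.
key = "layers.0.blocks.0.attn.relative_coords_table"
checkpoint_state = {
    key: SimpleNamespace(shape=(1, 23, 23, 2)),
    "head.weight": SimpleNamespace(shape=(10, 96)),  # hypothetical parameter
}
model_state = {
    key: SimpleNamespace(shape=(1, 43, 43, 2)),
    "head.weight": SimpleNamespace(shape=(10, 96)),
}

kept, skipped = drop_mismatched(checkpoint_state, model_state)
print(buffer_shapes(12))  # shapes reported for the checkpoint
print(buffer_shapes(22))  # shapes reported for the current model
print(skipped)

# With PyTorch, the filtered dict could then be loaded like this:
#   kept, skipped = drop_mismatched(checkpoint_state, model.state_dict())
#   model.load_state_dict(kept, strict=False)
```

If the messages are only logged, nothing needs to change. If they are raised as an error instead, filtering like this may help, since `load_state_dict(strict=False)` ignores missing and unexpected keys but still fails on shape mismatches.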