reddy-lab-code-research/StructCoder

question about pre-trained model for translation

Closed issue · 1 comment

I am trying to run --do_test on translation without training first. Do you have a "best-bleu" model that I can use for --do_test? When I use the model that was shared on Google Drive instead, I get a size mismatch error:

  File "run_translation.py", line 998, in <module>
    main()
  File "run_translation.py", line 942, in main
    module.load_state_dict(torch.load(os.path.join(args.output_dir, 'checkpoint-best-bleu/pytorch_model.bin')))
  File "C:\Users\djaek\anaconda3\envs\torch_env\lib\site-packages\torch\nn\modules\module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Seq2Seq:
    size mismatch for ast_type_emb.weight: copying a param with shape torch.Size([484, 768]) from checkpoint, the shape in current model is torch.Size([287, 768]).
    size mismatch for ast_path_head.weight: copying a param with shape torch.Size([5808, 128]) from checkpoint, the shape in current model is torch.Size([3444, 128]).

Any help would be appreciated.

Hi, we have updated the code and the pretrained checkpoint, so the error while loading checkpoints should be fixed now. We have not released the fine-tuned checkpoints yet, but we might do so in the future.
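
For anyone who hits a similar size mismatch, it can help to list which parameters disagree before calling load_state_dict. Below is a minimal sketch, not part of the StructCoder code: the helper name find_shape_mismatches is hypothetical, and the shapes are copied from the error message in this issue. In both mismatched parameters the first dimension scales with the same count (484 vs. 287, and 5808 = 12 x 484 vs. 3444 = 12 x 287), which suggests, though the thread does not confirm it, that the checkpoint and the current code were built with different numbers of AST node types.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both mappings but has a different shape."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes taken from the RuntimeError reported in this issue.
checkpoint_shapes = {
    "ast_type_emb.weight": (484, 768),
    "ast_path_head.weight": (5808, 128),
}
model_shapes = {
    "ast_type_emb.weight": (287, 768),
    "ast_path_head.weight": (3444, 128),
}

for name, (ckpt, cur) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs. current model {cur}")
```

With PyTorch, the two mappings can be built from real objects, for example `{k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}` for the checkpoint and `{k: tuple(v.shape) for k, v in model.state_dict().items()}` for the model, where path and model stand for your own checkpoint file and instantiated module.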