facebookresearch/CodeGen

Evaluate TransCoder_model_1 on the CodeXGLUE benchmark

LANCHERBA opened this issue · 3 comments

Hi,
I am trying to follow the instructions in dobf.md to evaluate TransCoder_model_1.pth on Clone detection. After I run the following command, an error related to reloading the model appears. I wonder if I did something wrong or if the script needs to be modified to evaluate the TransCoder model on CodeXGLUE.

SOURCEDIR=/home/h6ju/CodeGen
MODEL=/home/h6ju/CodeGen/TransCoder_model_1.pth
lr=2.5e-5
export PYTHONPATH=/home/h6ju/CodeGen
source $SOURCEDIR/newCodeGen/bin/activate

cd CodeXGLUE/Code-Code/Clone-detection-BigCloneBench/code; bash run_xlm_general.sh $MODEL 12 05 roberta_java TransCoder_model_1 $lr 2>&1 | tee logs/TransCoder_model_1_roberta_java_05_12_lr$lr.log

Then the following error comes out:

tee: logs/TransCoder_model_1_roberta_java_05_12_lr2.5e-5.log: No such file or directory
adding to path /home/h6ju/CodeGen
05/18/2022 17:29:14 - WARNING - __main__ -   Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
/home/h6ju/CodeGen/TransCoder_model_1.pth
Traceback (most recent call last):
  File "run.py", line 642, in <module>
    main()
  File "run.py", line 596, in main
    model = model_class.from_pretrained(args.model_name_or_path,
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 160, in from_pretrained
    model.reload_model(model_path)
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 124, in reload_model
    self.transformer.load_state_dict(model_reloaded, strict=True)
  File "/project/6001884/h6ju/CodeGen/newCodeGen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TransformerModel:
        size mismatch for position_embeddings.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([2048, 1024]).

Even after I changed strict=True to strict=False in models.py, the same error still appears.
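For reference, a minimal standalone sketch (with hypothetical embedding sizes mirroring the error above) suggests strict=False would not help here anyway: PyTorch still raises on shape mismatches, and strict only relaxes missing/unexpected keys.

import torch
import torch.nn as nn

# Hypothetical sizes mirroring the error: the checkpoint stores 1024
# position embeddings while the freshly built model expects 2048.
model = nn.Embedding(2048, 1024)
checkpoint = {"weight": torch.zeros(1024, 1024)}

try:
    model.load_state_dict(checkpoint, strict=False)
except RuntimeError as e:
    print(e)  # still raises: size mismatch for weight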
Thank you for your great help!

Hi,
TransCoder is not really meant as a way to pre-train a model for tasks like clone detection. What you are trying to do would reload only the encoder of TransCoder and fine-tune it for clone detection.
It seems like the size of the model in the checkpoint doesn't match the config. I believe this is because you are using a config for a RoBERTa-size model (12 layers, dim 1024) while TransCoder had 6 layers of dim 2048. I would have expected both the config and the model to be reloaded from your $MODEL checkpoint, but that does not seem to be happening.
The xlm_java model type would use the right tokenizer for this model, but I suspect you would hit the same parameter mismatch error.
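If you want to confirm what is actually stored in the checkpoint, printing the parameter shapes is a quick check. A sketch; the top-level keys tried below ("model", "encoder") are assumptions, so inspect ckpt.keys() and adjust the lookup to whatever torch.load actually returns for this file:

import torch

ckpt = torch.load("/home/h6ju/CodeGen/TransCoder_model_1.pth", map_location="cpu")
state_dict = ckpt
if isinstance(ckpt, dict):
    # Assumed layout: the state dict may be nested under one of these keys.
    for key in ("model", "encoder"):
        if key in ckpt and isinstance(ckpt[key], dict):
            state_dict = ckpt[key]
            break
for name, value in state_dict.items():
    if torch.is_tensor(value):
        print(name, tuple(value.shape))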

You can also use one of the RoBERTa-size models we trained to compare ourselves to CodeBERT and GraphCodeBERT, such as this one: https://dl.fbaipublicfiles.com/transcoder/pre_trained_models/dobf_plus_denoising.pth
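As an aside, the tee error at the top of your log only means the logs directory does not exist yet; creating it before launching (mkdir -p logs) will let the log file be written. If you go with the RoBERTa-size checkpoint, something like the following should fetch it (urllib is just one way to download it; wget works equally well, and the destination path is only an example), after which you can point $MODEL at the new file and rerun the same command:

import urllib.request

# Download the RoBERTa-size checkpoint next to the TransCoder one.
url = "https://dl.fbaipublicfiles.com/transcoder/pre_trained_models/dobf_plus_denoising.pth"
urllib.request.urlretrieve(url, "/home/h6ju/CodeGen/dobf_plus_denoising.pth")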

Hi,
Thanks for your reply! Based on that, could I ask whether the same error would happen if I tried to evaluate TransCoder on other CodeXGLUE benchmarks such as Code-to-Code Translation or Code Completion? Thanks again! If TransCoder does not fit CodeXGLUE, I will try the dobf_plus_denoising model instead!

Hi,
We definitely managed to test models with the same encoder parameters as TransCoder on CodeXGLUE before. I have not tested it recently, and based on your stack trace I suspect the same bug would appear with xlm_java instead of roberta_java.
I would need to look into that further to solve this bug.
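In the meantime, if you just need the reload to go through for experimentation, one possible (unsupported) workaround sketch is to filter the checkpoint down to the entries whose shapes match the freshly built model, around the load_state_dict call in models.py line 124; the names below (self.transformer, model_reloaded) come from your stack trace:

# Keep only checkpoint entries whose shapes match the current model,
# then load non-strictly. Any skipped weights stay randomly
# initialized, so results are only meaningful once the
# config/checkpoint mismatch above is actually resolved.
current = self.transformer.state_dict()
filtered = {
    k: v for k, v in model_reloaded.items()
    if k in current and v.shape == current[k].shape
}
self.transformer.load_state_dict(filtered, strict=False)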