AttributeError in extract_vision_keys.py

Question

Closed this issue 3 years ago · 1 comments

Running extract_vision_keys.py raises an AttributeError:

Load model from snap/xmatching/bert_resnext/BEST.pth.model.
Traceback (most recent call last):
  File "vokenization/extract_vision_keys.py", line 259, in <module>
    joint_model = torch.load(args.load_dir + '/BEST.pth.model')
  File "/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/torch/serialization.py", line 774, in _legacy_load
    result = unpickler.load()
AttributeError: Can't get attribute 'gelu' on <module 'transformers.modeling_bert' from '/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/transformers/modeling_bert.py'>

My transformers version is 3.3.0, as specified in requirements.txt.
Is this a version problem, or does something need to be added to modeling_bert.py?
Do you have any idea what causes this? Thank you.

Answer 1 · 2021-11-22T09:59:39.000Z

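The problem is solved.
The checkpoint was pickled with a reference to gelu in transformers.modeling_bert, and the installed modeling_bert.py does not define that name. I added the following code to transformers/modeling_bert.py, and now the script runs.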

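# Needed only if not already imported at the top of modeling_bert.py.
import math
import torch

def gelu(x):
    """ Original implementation of the gelu activation function in the Google Bert repo when initially created.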
        For information: OpenAI GPT's gelu is slightly different (and gives slightly different results):
        0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
        Also see https://arxiv.org/abs/1606.08415
    """
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_new(x):
    """ Implementation of the gelu activation function currently in Google Bert repo (identical to OpenAI GPT).
        Also see https://arxiv.org/abs/1606.08415
    """
    return 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))

def swish(x):
    return x * torch.sigmoid(x)
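
For anyone who would rather not edit files under site-packages, here is a minimal sketch of why the fix above works and how the same effect can be had at runtime. It uses only the standard library: `fake_modeling` is a hypothetical stand-in for `transformers.modeling_bert`, and `gelu` here takes a plain float instead of a tensor. Pickle stores a function as a module name plus an attribute name, so loading fails when the module no longer has that attribute and succeeds once it is put back.

```python
import math
import pickle
import sys
import types

# Hypothetical stand-in for a library module such as transformers.modeling_bert.
MODULE_NAME = "fake_modeling"
fake_module = types.ModuleType(MODULE_NAME)
sys.modules[MODULE_NAME] = fake_module


def gelu(x):
    """erf-based GELU for a plain Python float (no torch needed here)."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Make the function look as if it were defined inside the stand-in module.
gelu.__module__ = MODULE_NAME
gelu.__qualname__ = "gelu"

# "Old" library version: the module exposes gelu, so a checkpoint can point at it.
fake_module.gelu = gelu
payload = pickle.dumps({"activation": gelu})

# "New" library version: the attribute is gone, so unpickling fails.
del fake_module.gelu
error_message = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error_message = str(exc)
print("load failed:", error_message)

# Workaround: put the attribute back at runtime, then load again.
fake_module.gelu = gelu
restored = pickle.loads(payload)
print("load succeeded:", restored["activation"](0.0))
```

Applied to the real script, the same idea would mean assigning `gelu`, `gelu_new`, and `swish` onto the imported `transformers.modeling_bert` module just before the `torch.load` call. I have not tested that against this checkpoint, so treat it as an assumption; it may also fail on other names that the checkpoint references.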