tensorflow/text

text/docs/tutorials/fine_tune_bert.ipynb : BERT not learning

sungengyi opened this issue · 5 comments

warmup_schedule = tfm.optimization.lr_schedule.LinearWarmup(
    warmup_learning_rate=0,
    after_warmup_lr_sched=linear_decay,
    warmup_steps=warmup_steps)

...

optimizer = tf.keras.optimizers.experimental.AdamW(
    lr=warmup_schedule,
    weight_decay=0.01,
)

When I train the BERT classifier with this optimizer, the model doesn't learn anything. I changed the optimizer back to the one used in the previous version of the code:

optimizer = nlp.optimization.create_optimizer(2e-5, num_train_steps=num_train_steps, num_warmup_steps=warmup_steps)
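
For context, this is roughly how I wire either optimizer into the classifier; the model and dataset names below are placeholders standing in for the notebook's objects, not its exact code:

# Hypothetical names: `bert_classifier`, `train_ds`, and `val_ds` stand in
# for the tutorial's model and tf.data pipelines.
bert_classifier.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

bert_classifier.fit(train_ds, validation_data=val_ds, epochs=3)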

I suspect the problem is a typo in warmup_learning_rate=0 in the first snippet above, but I haven't tried rerunning the notebook with that changed.
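
For reference, this is roughly how I expected the schedule and optimizer to fit together. Treat it as a sketch, not the notebook's exact code: the 2e-5 peak rate comes from the old tutorial, and num_train_steps / warmup_steps are the step counts computed earlier in the notebook.

# Linear decay from the 2e-5 peak down to 0 over the full run,
# preceded by a linear warmup from 0 up to that peak.
linear_decay = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=2e-5,
    decay_steps=num_train_steps,
    end_learning_rate=0)

warmup_schedule = tfm.optimization.lr_schedule.LinearWarmup(
    warmup_learning_rate=0,
    after_warmup_lr_sched=linear_decay,
    warmup_steps=warmup_steps)

# Using the full `learning_rate` keyword here rather than `lr`.
optimizer = tf.keras.optimizers.experimental.AdamW(
    learning_rate=warmup_schedule,
    weight_decay=0.01)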

Thanks for pointing this out. It looks like nobody has commented yet, but it is being looked at.

Hi,

I'm fixing this right now.

I'm pretty sure the culprit is this line, which is broken in 2.9 but is fixed in master:

https://github.com/keras-team/keras/blob/r2.9/keras/optimizers/optimizer_experimental/adamw.py#L173

I'm just going to switch to standard Adam, and drop the weight decay.
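
Concretely, something along these lines (a sketch of the replacement, keeping the same warmup schedule as before):

# Same warmup/decay schedule, fed to plain Adam with no decoupled weight decay.
optimizer = tf.keras.optimizers.experimental.Adam(
    learning_rate=warmup_schedule)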

@MarkDaoust
Hi!

Thank you! I used code from the previous tutorial (tf-models-official==2.4.0) and trained a classifier to 90+% accuracy. Due to some Keras updates, I can't use the old version of the code anymore, and with the same hyperparameters I couldn't reproduce the same performance. I'm not an expert in TensorFlow, so I was wondering what the problem might be... Do you have any ideas? :)

I've made some progress on the fix. Working on getting it submitted this week.

I moved this to tensorflow/models, so it can be with its friends, and fixed it at the same time.

I dropped the AdamW optimizer in this version, since the Keras implementation is buggy in TF 2.9.