CyberZHG/keras-radam

Cannot start training with TensorFlow 2.0 and distribute.MirroredStrategy

makercob opened this issue · 6 comments

Describe the Bug
Cannot start training with TensorFlow 2.0 and distribute.MirroredStrategy.

Version Info
TensorFlow 2.0.0-beta1
Python 3.6.8

  • [x] I'm using the latest version

Minimal Code To Reproduce

import os
os.environ['TF_KERAS'] = '1'  # tell keras-radam to use the tf.keras implementation
import tensorflow as tf
from keras_radam import RAdam

# model, train_dataset, and FLAGS are defined elsewhere
strategy = tf.distribute.MirroredStrategy(
    devices=FLAGS.compute_devices,
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())

with strategy.scope():
    optimizer = RAdam(learning_rate=1e-3)
    model.compile(optimizer=optimizer, ..., run_eagerly=False)
    model.fit(train_dataset)

[Screenshot attached: error traceback from the failed run, 2019-08-21]

Try changing this line

self._set_hyper('total_steps', total_steps)

to

self._set_hyper('total_steps', float(total_steps))

?
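For context, here is a minimal sketch of where that cast lands. The constructor is abridged and the argument list is assumed from the keras-radam source; the point is only that total_steps should be stored as a float hyperparameter:

import tensorflow as tf

class RAdam(tf.keras.optimizers.Optimizer):
    def __init__(self, learning_rate=1e-3, total_steps=0, name='RAdam', **kwargs):
        super(RAdam, self).__init__(name, **kwargs)
        self._set_hyper('learning_rate', learning_rate)
        # Storing a Python int makes this hyperparameter an integer tensor,
        # which appears to trip a dtype error once MirroredStrategy wraps the
        # optimizer's hyper variables; casting to float keeps it consistent
        # with the other (float) hyperparameters.
        self._set_hyper('total_steps', float(total_steps))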

@CyberZHG Thanks, it works.
But it seems much slower than the native optimizers.

I think that's because the optimizer is implemented in pure Python (compared with the C++ & CUDA kernels behind the native optimizers).

My bad. Issues in my own code were probably the culprit of the performance drop.

@CyberZHG Compared to the native SGD optimizer, training time per epoch is the same, but convergence is much faster. Thanks!

I've made a new release for this issue. You can upgrade to 0.7.0 if you're using pip.
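For anyone landing here later: the project is published on PyPI as keras-rectified-adam (assuming the package name from the project README), so the upgrade is:

pip install -U keras-rectified-adam==0.7.0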