Viredery/tf-eager-fasterrcnn

Training losses all nan values

Opened this issue · 1 comment

LMD93 commented

Hello, I tried running the Jupyter notebook as-is to train the model. The only change I made was to the scale argument of train_dataset:

train_dataset = coco.CocoDataSet(
    "./COCO2017/",
    "val",
    flip_ratio=0.5,
    pad_mode="fixed",
    mean=img_mean,
    std=img_std,
    scale=(256, 512),
)

I printed out the individual losses and this is what I see.

rpn_class_loss  tf.Tensor(nan, shape=(), dtype=float32)
rpn_bbox_loss  tf.Tensor(nan, shape=(), dtype=float32)
rcnn_class_loss  tf.Tensor(0.0, shape=(), dtype=float32)
rcnn_bbox_loss  tf.Tensor(nan, shape=(), dtype=float32)

There is no error thrown, and I did not make any changes to any of the scripts. Any idea why this is happening? Thanks!

I am not sure what the problem is. I tried it and it ran successfully. Could you please send your program logs or screenshots?
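One way to gather the logs requested above is to check each loss for NaN/inf at every step and record the first step where it goes wrong. The snippet below is only a minimal sketch, not part of this repository: the helper name find_non_finite is hypothetical, and it assumes each loss can be converted with float(), which should hold for plain Python numbers and for scalar eager tensors. The sample values are copied from the output in the report.

```python
import math


def find_non_finite(losses):
    """Return the names of losses whose value is NaN or +/-inf.

    `losses` maps a loss name to a scalar. float() is assumed to work on
    each value (Python numbers, or scalar eager tensors).
    """
    bad = []
    for name, value in losses.items():
        if not math.isfinite(float(value)):
            bad.append(name)
    return bad


# Values copied from the printed losses in the report.
losses = {
    "rpn_class_loss": float("nan"),
    "rpn_bbox_loss": float("nan"),
    "rcnn_class_loss": 0.0,
    "rcnn_bbox_loss": float("nan"),
}

bad = find_non_finite(losses)
print(bad)  # ['rpn_class_loss', 'rpn_bbox_loss', 'rcnn_bbox_loss']
```

Calling a check like this inside the training loop and stopping at the first non-empty result would show which batch first produces a non-finite loss, so that batch's image shape and boxes can be included in the logs.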