google-research/deeplab2

Unstable numeric output for downstream task (moat 4 w/o pos)

edwardyehuang opened this issue · 1 comments

Checkpoints: moat4 w/o pos

When the output of moat4 is fed into the following layers (e.g., a 3x3 conv), those layers can easily produce NaN outputs.

At least with moat0, the same issue does not appear.

This is the first time in my career that I have run into this (I have seen NaN many times, but never like this), so I need some time to investigate.
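One way to narrow this down is to record the output of each layer in forward-pass order and find the first one that contains NaN or Inf. The snippet below is only a minimal sketch of that idea in plain Python: the layer names and values are hypothetical, and how the per-layer outputs are collected from the actual model is left out.

```python
import math


def first_non_finite_layer(layer_outputs):
    """Return (name, stats) for the first layer whose output has NaN/Inf.

    layer_outputs: dict mapping layer name -> flat list of floats, inserted
    in forward-pass order (dicts keep insertion order in Python 3.7+).
    Returns None if every value is finite.
    """
    for name, values in layer_outputs.items():
        bad = sum(1 for v in values if not math.isfinite(v))
        if bad:
            finite = [abs(v) for v in values if math.isfinite(v)]
            return name, {
                "non_finite": bad,
                "max_abs_finite": max(finite, default=0.0),
            }
    return None


# Hypothetical layer names and values, for illustration only.
outputs = {
    "moat4/last_stage": [1.0, 3.2e4, -7.5e4],
    "head/conv3x3": [float("nan"), 2.0],
    "head/logits": [float("nan")],
}
result = first_non_finite_layer(outputs)
print(result)
```

Here the backbone output is still finite and the first non-finite values show up in the (hypothetical) 3x3 conv, which matches the symptom described above. Logging the largest finite magnitude per layer in the same pass can also help show whether activations grow unusually large before the NaN appears.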

I will update this issue if I find anything new. Please also check whether the provided checkpoints work.

Thanks for your interest. For this issue, I have two suggestions:

  1. First of all, make sure the weights of all layers are correctly loaded, including the stem layers, etc.
  2. Try adjusting the learning rate, drop path rate, and batch size, which may resolve the issue.
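For the first suggestion, a rough way to check that nothing was silently skipped is to compare the variables the model expects against the variables stored in the checkpoint. The following is a minimal sketch, not deeplab2 code: it assumes both sides have already been collected as name-to-shape mappings, and all names and shapes shown are hypothetical. In a TensorFlow setup the checkpoint side could come from `tf.train.list_variables`, though the names may need normalizing before they line up with the model's variable names.

```python
def compare_variables(model_vars, checkpoint_vars):
    """Compare two {variable_name: shape} mappings and report differences."""
    missing = sorted(n for n in model_vars if n not in checkpoint_vars)
    unused = sorted(n for n in checkpoint_vars if n not in model_vars)
    mismatched = sorted(
        n
        for n in model_vars
        if n in checkpoint_vars
        and tuple(model_vars[n]) != tuple(checkpoint_vars[n])
    )
    return {
        "missing_in_checkpoint": missing,  # would keep their random init
        "unused_in_checkpoint": unused,    # stored but never loaded
        "shape_mismatch": mismatched,      # present on both sides, wrong shape
    }


# Hypothetical variable names and shapes, for illustration only.
model_vars = {
    "stem/conv1/kernel": (3, 3, 3, 64),
    "stem/conv2/kernel": (3, 3, 64, 128),
    "stage2/block1/kernel": (1, 1, 128, 256),
}
checkpoint_vars = {
    "stem/conv1/kernel": (3, 3, 3, 64),
    "stage2/block1/kernel": (1, 1, 128, 512),
    "optimizer/step": (),
}
report = compare_variables(model_vars, checkpoint_vars)
print(report)
```

Any entry under `missing_in_checkpoint` (here a stem layer) would be left at its initial values, which is the kind of partial load the first suggestion asks to rule out.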