Negative Loss for Pointpillar_MIMO_Var_C

Question

Opened this issue a year ago · 3 comments

I tried to train a pointpillar_mimo_var_c model with the following changes:

No Docker container, because my server uses CUDA 11.3 and PyTorch 12.1
road_plane was not used
KITTI dataset
Random seed 3

The loss rapidly became negative.

The training parameters are the defaults, as follows:

      LOSS_CONFIG:
           CLF_LOSS_TYPE: SoftmaxFocalLossV2
           REG_LOSS_TYPE: VarRegLoss
           LOSS_WEIGHTS: {
               'cls_weight': 1.0,
               'loc_weight': 2.0,
               'dir_weight': 0.2,
               'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
               'loc_l1_weight': 1.0,
               'loc_var_weight': 0.05
           }

   POST_PROCESSING:
       RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
       SCORE_THRESH: 0.1
       OUTPUT_RAW_SCORE: False

       EVAL_METRIC: kitti

       NMS_CONFIG:
           MULTI_CLASSES_NMS: False
           NMS_TYPE: nms_gpu
           NMS_THRESH: 0.01
           NMS_PRE_MAXSIZE: 4096
           NMS_POST_MAXSIZE: 500


OPTIMIZATION:
   BATCH_SIZE_PER_GPU: 4
   NUM_EPOCHS: 80

   OPTIMIZER: adam_onecycle
   LR: 0.003
   WEIGHT_DECAY: 0.01
   MOMENTUM: 0.9

   MOMS: [0.95, 0.85]
   PCT_START: 0.4
   DIV_FACTOR: 10
   DECAY_STEP_LIST: [35, 45]
   LR_DECAY: 0.1
   LR_CLIP: 0.0000001

   LR_WARMUP: False
   WARMUP_EPOCH: 1

   GRAD_NORM_CLIP: 10

I believe this is exactly the same as the demo code and config files provided.
Could you please help me?
I would like to know which parameters are appropriate, and which ones you used for the experiments in the related IEEE paper.
I would appreciate it if you could take some time to look into this problem.

Answer 1 · 2023-03-24T23:06:42.000Z

The variance losses are negative which made the total loss negative for me. Do you get similar results to this?

Answer 2 · 2023-03-24T23:27:53.000Z

> The variance losses are negative which made the total loss negative for me. Do you get similar results to this?

So it is quite normal for the total loss to be negative? A negative variance loss was really not what I expected, so I have never trained MIMO_var_C for more than 15 epochs. I will check the loss curves after training and keep you informed, if that does not bother you.
I am not actually familiar with this area, and I am still curious why a negative loss works in neural network optimization.

Thanks for replying so quickly. Your MIMO is an amazing piece of work on uncertainty evaluation for 3D detection, and it has helped me a lot. I really appreciate that you published your code.

Answer 3 · 2023-03-24T23:43:24.000Z

Yes, it is normal. If you plug the equations into Wolfram Alpha and enter some reasonable values, you can see that the result is most likely negative. You could also print some values at the point where I implemented the loss functions to check the implementation.
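The same check can be done in a few lines of Python instead of Wolfram Alpha. This is only a sketch: it assumes the variance loss has the common attenuated form exp(-s) * |r| + s, where r is the regression residual and s is the predicted log-variance. That form and the function name `attenuated_l1` are assumptions for illustration, not a copy of the repository's VarRegLoss.

```python
import math


def attenuated_l1(residual: float, log_var: float) -> float:
    """Hypothetical attenuated L1 loss: exp(-s) * |r| + s.

    residual -- regression error r (prediction minus target)
    log_var  -- predicted log-variance s (assumed parameterization)
    """
    return math.exp(-log_var) * abs(residual) + log_var


# Accurate and confident prediction: small residual, small predicted variance.
confident = attenuated_l1(residual=0.01, log_var=-3.0)

# Less accurate, less confident prediction: larger residual, larger variance.
uncertain = attenuated_l1(residual=0.5, log_var=0.0)

print(f"confident: {confident:.3f}")
print(f"uncertain: {uncertain:.3f}")
```

Under this assumed form, the best log-variance for a fixed residual is s = log|r|, which gives a loss of 1 + log|r|. That value is negative whenever |r| is below 1/e (about 0.37), so a model that regresses well would be expected to show a negative variance loss.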

I'm not too sure how this affects the optimization. I remember that many papers using these loss functions for 3D object detection were closed-source. Maybe they found a different way to implement it and have a positive loss?
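On the optimization question, one general point may help: gradient-based optimizers use only the gradient of the loss, so a constant offset that pushes the loss below zero does not change the parameter updates. The sketch below uses a toy quadratic, not the detector's loss, and all names in it are hypothetical.

```python
def numeric_grad(loss, x: float, eps: float = 1e-6) -> float:
    """Central-difference estimate of d(loss)/dx at x."""
    return (loss(x + eps) - loss(x - eps)) / (2.0 * eps)


def grad_descent(loss, x0: float, lr: float = 0.1, steps: int = 200) -> float:
    """Plain gradient descent on a single scalar parameter."""
    x = x0
    for _ in range(steps):
        x -= lr * numeric_grad(loss, x)
    return x


def loss_nonneg(x: float) -> float:
    """Toy loss whose minimum value is 0."""
    return (x - 3.0) ** 2


def loss_shifted(x: float) -> float:
    """Same toy loss shifted down by 10, so its minimum value is -10."""
    return (x - 3.0) ** 2 - 10.0


x_nonneg = grad_descent(loss_nonneg, x0=0.0)
x_shifted = grad_descent(loss_shifted, x0=0.0)

print(f"minimizer of non-negative loss: {x_nonneg:.4f}")
print(f"minimizer of shifted loss:      {x_shifted:.4f}")
print(f"final shifted loss value:       {loss_shifted(x_shifted):.4f}")
```

Both runs end at essentially the same parameter value, even though one loss finishes far below zero. What matters for training is whether the loss keeps decreasing, not whether it is positive.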