Compute_plane_seg_loss
Closed this issue · 4 comments
Hi @fuy34 , thank you for sharing the code. I have one question about compute_plane_seg_loss
here, specifically lines 154-158 where plane_mask is calculated.
Please correct me if I am wrong.
(1) Line 152, pred = pred_in - tf.reduce_max(pred_in, axis=-1, keep_dims=True),
keeps pred
in [-inf, 0] for numerical stability, since exp(pred) may overflow if pred is too large.
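To make point (1) concrete, here is a small NumPy sketch (illustrative only, not the repo's TF code, with made-up logit values): without the shift, exp() overflows for large logits; after subtracting the per-row max, every entry is at most 0, so exp() stays in (0, 1].

```python
import numpy as np

# Hypothetical large logits, chosen to trigger float64 overflow in exp().
logits = np.array([1000.0, 999.0, 998.0])

with np.errstate(over="ignore"):
    naive = np.exp(logits)               # exp(1000) overflows to inf

shifted = np.exp(logits - logits.max())  # exp([0, -1, -2]), all finite

print(np.isinf(naive).any())     # True: the naive version overflows
print(np.isfinite(shifted).all())  # True: the shifted version is safe
```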
(2) However, I can see that line 154 has # plane_mask = tf.reduce_logsumexp(pred_plane_only, axis=-1) - tf.reduce_logsumexp(pred, axis=-1),
which I can understand, since it computes the log-probability of a pixel belonging to a plane. However, this line is commented out, and lines 155-158 are used to calculate plane_mask
instead.
I tried to understand the equations. It seems to me that you are computing plane_mask = tf.reduce_logsumexp(pred_plane_only-pred_plane_only_max, axis=-1) - tf.reduce_logsumexp(pred-pred_plane_only_max, axis=-1),
but I don't see the reason for that.
Could you explain the purpose of this operation?
Hi Huangying, thanks for your interest in our work.
Your understanding is correct.
Lines 155-158 are a reimplementation of the commented line, done for numerical stability. The idea is the same as in line 152. If we look at the first two terms of line 157 (the part in '[]' below),
plane_mask = [tf.reduce_logsumexp(pred_plane_only - pred_plane_only_max, axis=-1,keep_dims=True) + pred_plane_only_max
] - tf.reduce_logsumexp(pred, axis=-1,keep_dims=True),
they should be equal to tf.reduce_logsumexp(pred_plane_only, axis=-1)
.
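The identity being used here is logsumexp(x - m) + m == logsumexp(x), since adding a constant inside logsumexp adds the same constant outside. A quick NumPy check (with a hand-rolled naive logsumexp and arbitrary example values, not the repo's code):

```python
import numpy as np

def logsumexp(x, axis=-1):
    # Naive log-sum-exp, no stabilization; fine for small inputs.
    return np.log(np.sum(np.exp(x), axis=axis))

x = np.array([-3.0, -1.0, -2.0])  # stand-in for pred_plane_only
m = x.max()                       # stand-in for pred_plane_only_max

lhs = logsumexp(x - m) + m  # the bracketed terms on line 157
rhs = logsumexp(x)          # the commented-out form on line 154

print(np.allclose(lhs, rhs))  # True: the two forms agree
```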
Hi @fuy34 , yes, you are right that [tf.reduce_logsumexp(pred_plane_only - pred_plane_only_max, axis=-1,keep_dims=True) + pred_plane_only_max]
is the same as tf.reduce_logsumexp(pred_plane_only - pred_plane_only_max + pred_plane_only_max, axis=-1,keep_dims=True),
and thus the same as
tf.reduce_logsumexp(pred_plane_only, axis=-1)
.
However, could you explain more about why the original implementation at line 154 has a numerical issue? I still don't see why line 154 would cause one.
In most cases, the commented line would work fine.
The intention behind the reimplementation is that pred_plane_only
is a part of pred
. If the final channel pred[:, :, :, -1:]
happens to hold a very large value, then after line 152 some elements of pred_plane_only
can become very negative. In that case exp(pred_plane_only)
can underflow to almost 0, so log(exp(.))
becomes -inf. If instead we compute pred_plane_only - pred_plane_only_max
first, the values move closer to 0 (here pred_plane_only_max
is a negative number), so exp(.)
stays closer to 1, and we avoid tf.reduce_logsumexp(pred_plane_only)
becoming -inf.
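The failure mode described above can be sketched in NumPy (hypothetical values chosen to force float64 underflow; this is an illustration, not the repo's TF code): the naive log-sum-exp underflows to log(0) = -inf, while shifting by the (negative) max first keeps the result finite.

```python
import numpy as np

# Very negative plane logits, as after line 152 when the last channel
# of pred_in dominates the max.
pred_plane_only = np.array([-2000.0, -2001.0, -2003.0])

with np.errstate(divide="ignore"):
    # exp(-2000) underflows to 0 in float64, so sum == 0 and log -> -inf.
    naive = np.log(np.sum(np.exp(pred_plane_only)))

m = pred_plane_only.max()  # -2000.0, a negative number
# Shifted inputs are [0, -1, -3]; exp() stays near 1 and the sum is finite.
stable = np.log(np.sum(np.exp(pred_plane_only - m))) + m

print(np.isneginf(naive))   # True: the unshifted form collapses to -inf
print(np.isfinite(stable))  # True: the shifted form stays finite
```

This is exactly the situation lines 155-158 guard against.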
@fuy34 Thanks! It is clear to me now.