Yang7879/3D-BoNet

Error in Hungarian: matrix contains invalid numeric entries

Opened this issue · 7 comments

I am training on a dental point cloud dataset that has already been tested on ASIS with quite good results, but now, testing on 3D-BoNet with h5 files generated by the recommended script, the algorithm crashes after a few iterations:

  File "/home/ubuntu/3D-BoNet/helper_net.py", line 120, in assign_mappings_valid_only
    row_ind, col_ind=linear_sum_assignment(valid_cost)

  File "/home/ubuntu/miniconda3/envs/tf_gpu114_p36_source/lib/python3.6/site-packages/scipy/optimize/_hungarian.py", line 93, in linear_sum_assignment
    raise ValueError("matrix contains invalid numeric entries")

ValueError: matrix contains invalid numeric entries

Do you have any idea why this could be happening? A feature of the dataset is that each block has very few instances, and that every mandible (room) has one instance that is present in all blocks (the mouth gum).

Hi @masotrix, you may double-check whether all blocks (i.e., each individual point cloud you feed into the network) contain at least one valid instance. If a block has no valid instance, you need to simply skip that point cloud for training; otherwise, the association algorithm is unable to process it.
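A minimal sketch of such a pre-training filter, assuming instance labels are stored per point and a value of -1 marks points that belong to no instance (the helper name has_valid_instance is hypothetical, not code from the repo):

    import numpy as np

    def has_valid_instance(ins_labels, min_points=1):
        """True if the block contains at least one instance with at least
        `min_points` points; label -1 marks points that belong to no instance."""
        _, counts = np.unique(ins_labels[ins_labels >= 0], return_counts=True)
        return bool(np.any(counts >= min_points))

    # Example: a 4096-point block whose points all carry instance id -1 gets skipped.
    block_ins = np.full(4096, -1, dtype=np.int32)
    if not has_valid_instance(block_ins):
        print("skip this block")  # i.e., do not write it into the h5 files / training set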

Hi @Yang7879 , thanks, what you say is probably the issue.

  1. How is a valid instance defined?
  2. Should "invalid blocks" be skipped when building the h5 files, or can they be skipped inside 3D-BoNet?
  3. Would instances that do not fit within a block end up with much worse final segmentation?

Hi @masotrix, for each point cloud (say 4096 points) there should be at least one instance inside. Note that an instance is not necessarily a complete object; it could be a part of an object if you happen to cut the object. For each such instance (say 400 pts), there should be only one category label for those 400 pts.

This function may be helpful:

def load_full_file_list(self, areas):
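As a rough illustration of the second constraint above (one category label per instance), a hedged check along these lines could be run over each block before training; the array names sem_labels/ins_labels are assumptions about how the per-point labels are stored, not the repo's actual variables:

    import numpy as np

    def instances_have_single_label(sem_labels, ins_labels):
        """Check that every instance in a block carries exactly one semantic label."""
        for ins_id in np.unique(ins_labels[ins_labels >= 0]):
            sem_of_instance = np.unique(sem_labels[ins_labels == ins_id])
            if sem_of_instance.size != 1:
                return False  # this instance mixes several categories
        return True

    # Example block: two instances, each with a single category -> passes the check.
    sem = np.array([0, 0, 0, 2, 2], dtype=np.int32)
    ins = np.array([0, 0, 0, 1, 1], dtype=np.int32)
    assert instances_have_single_label(sem, ins)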

Hi, I ran into the same problem. The reason is that I normalized the blocks into [-1, 1], which can make the coordinate sum of a gt bbox less than 0. So, when all gt bboxes in the same batch of samples sum to less than 0, tf.reduce_sum(Y_bbox_helper) is equal to 0 and bbox_loss_l2_pos is illegally divided by 0. The solution can be found in #25 .
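This is only a guard sketch, not the actual fix in #25: it assumes a hypothetical per-box L2 term per_box_l2 alongside the validity mask Y_bbox_helper mentioned above (24 boxes per cloud is just an example), and clamps the denominator so an all-invalid batch no longer produces NaN that later crashes linear_sum_assignment:

    import tensorflow as tf  # TensorFlow 1.x style, matching the repo

    # per_box_l2: [batch, num_boxes] L2 error per predicted/gt box pair (assumed name)
    # Y_bbox_helper: [batch, num_boxes] 1.0 for valid gt boxes, 0.0 otherwise
    per_box_l2 = tf.placeholder(tf.float32, [None, 24])
    Y_bbox_helper = tf.placeholder(tf.float32, [None, 24])

    num_valid = tf.reduce_sum(Y_bbox_helper)
    # Guard against an all-invalid batch, which otherwise yields 0/0 = NaN and
    # propagates into the cost matrix fed to scipy's linear_sum_assignment.
    bbox_loss_l2_pos = tf.reduce_sum(per_box_l2 * Y_bbox_helper) / tf.maximum(num_valid, 1.0)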


Hi @HiphonL, I used your solution and added an abs operation for the S3DIS dataset. However, it resulted in the same error.


Did you manage to solve this problem?

Has this problem been solved?