Zzh-tju/CIoU

Cluster-NMS

buttercutter opened this issue · 7 comments

I am trying to understand Cluster-NMS operations.

The mathematical proof seems a bit complicated to follow and comprehend.

  1. Why does C1 not change value? In other words, why is C1 == X?

  2. How is b1 obtained?

  3. Why is it Cn = E x X instead of Cn = E x Cn-1?

  1. Matrix C changes at every iteration until vector b stops changing.

  2. Vector b is obtained by taking the column-wise maximum of matrix C and then binarizing it. So b = (b1, b2, ..., bn) is a 0/1 vector, where 1 denotes preservation and 0 denotes suppression.

  3. Vector b holds the suppression result of NMS at a given iteration. Left-multiplying by the diagonal matrix E is equivalent to a row transformation on matrix X: it zeroes the rows of the currently suppressed boxes, so that they have no effect on the other boxes. (Note that X is the original IoU matrix.)

Finally, once vector b no longer changes, we get exactly the same result as Original NMS.
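The three steps above can be sketched in a few lines. This is only an illustration, not the repository's implementation: it uses plain Python lists instead of tensors, the function names `greedy_nms` and `cluster_nms`, the IoU values, the threshold 0.5 and the `<=` comparison used for binarizing are all assumptions made for the example, and the boxes are assumed to be sorted by descending score.

```python
def greedy_nms(iou, thresh):
    """Original NMS on a symmetric IoU matrix; boxes sorted by descending score."""
    n = len(iou)
    keep = [1] * n
    for i in range(n):
        if not keep[i]:
            continue  # a suppressed box cannot suppress others
        for j in range(i + 1, n):
            if iou[i][j] > thresh:
                keep[j] = 0
    return keep


def cluster_nms(iou, thresh):
    """Iterate C = E x X until C stops changing, then return the 0/1 vector b."""
    n = len(iou)
    # X: strictly upper-triangular IoU matrix, so box i only affects boxes j > i.
    X = [[iou[i][j] if j > i else 0.0 for j in range(n)] for i in range(n)]
    C = X  # start from the original IoU matrix
    b = [1] * n
    for _ in range(n + 1):  # safety bound chosen for this sketch
        # Column-wise maximum of C, binarized: 1 = preserved, 0 = suppressed.
        b = [int(max(C[i][j] for i in range(n)) <= thresh) for j in range(n)]
        # E x X with E = diag(b): zero the rows of currently suppressed boxes.
        C_next = [[b[i] * X[i][j] for j in range(n)] for i in range(n)]
        if C_next == C:
            break
        C = C_next
    return b


# Hypothetical IoU values for 5 boxes (symmetric, zero elsewhere).
pairs = {(0, 1): 0.8, (0, 2): 0.6, (1, 3): 0.7, (3, 4): 0.9}
iou = [[0.0] * 5 for _ in range(5)]
for (i, j), v in pairs.items():
    iou[i][j] = iou[j][i] = v

print(cluster_nms(iou, 0.5))  # [1, 0, 0, 1, 0]
print(greedy_nms(iou, 0.5))   # [1, 0, 0, 1, 0]
```

In this example the first pass gives b = [1, 0, 0, 0, 0], because box 1 (itself suppressed by box 0) still suppresses box 3. Multiplying by E always starts again from X rather than from the previous C, so rows that were zeroed in an earlier pass can come back once their box is found to be preserved; here b becomes [1, 0, 0, 1, 1] and then settles at [1, 0, 0, 1, 0].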

This will ignore those current suppressed boxes so that they will not have any effects on the other boxes.

How exactly does left-multiplying by the diagonal matrix E achieve this?

For example, let b=[1 0 0 1 0].

In our paper, the matrix is

E=
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 1 0
0 0 0 0 0

and we then compute E×X.

In practice, we use

E=
1 1 1 1 1
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1
0 0 0 0 0

and then multiply it element-wise with the upper triangular IoU matrix X.

Why are there extra 1s in the matrix used in practice?

And how do all those iterations converge to the original NMS result?

Left-multiplying a matrix by a diagonal matrix is equivalent to a row transformation (a result from higher algebra). So in practice, I replace it with element-wise multiplication, because that is faster than matrix multiplication. As for why the result of Cluster-NMS is equal to that of Original NMS, a simple case is provided here: https://github.com/Zzh-tju/CIoU#description-of-cluster-nms-and-its-usage
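The equivalence can be checked on the b = [1 0 0 1 0] example. Entry (i, j) of diag(b) × X is b_i · X_ij, which is exactly what the element-wise product gives when row i of E is filled with b_i. The sketch below uses plain Python lists; the `matmul` helper and the values in X are hypothetical and only serve the illustration.

```python
def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


b = [1, 0, 0, 1, 0]
n = len(b)

# Hypothetical upper-triangular IoU matrix X.
X = [
    [0.0, 0.8, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

# Paper form: E = diag(b), followed by a matrix product.
E_paper = [[b[i] if i == j else 0 for j in range(n)] for i in range(n)]
via_matmul = matmul(E_paper, X)

# Practical form: row i of E is filled with b[i], followed by an element-wise product.
E_practice = [[b[i]] * n for i in range(n)]
via_elementwise = [[E_practice[i][j] * X[i][j] for j in range(n)] for i in range(n)]

print(via_matmul == via_elementwise)  # True
```

Both forms keep rows 0 and 3 of X and zero rows 1, 2 and 4. The element-wise version needs n² multiplications, whereas a general matrix product needs n³.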

For the mathematical details, kindly refer to our paper.

So in practice, I replace it with element-wise multiplication for simplicity. Because it's faster than matrix multiplication.

I may have missed something, but how is the element-wise multiplication with the matrix used in practice equivalent to the matrix multiplication given in the paper?