A question about frame construction

Question

Closed this issue 8 months ago · 4 comments

Hi @magehrig,
I recently noticed that the frame construction in your code follows this pattern:

The channels are first split by polarity and then by time interval.

However, my impression is that the following scheme is more common:

Here, in contrast, the channels are first split by time interval and then by polarity.

I therefore trained RVT-B on GEN1 with both framing methods and got the results below:

        if sort_mode == 'POL_SORT':
            # Polarity is the slowest-varying axis, then the time bin.
            indices = x.long() + \
                    wd * y.long() + \
                    ht * wd * t_idx.long() + \
                    bn * ht * wd * pol.long()
        elif sort_mode == 'TIME_SORT':
            # Time bin is the slowest-varying axis, then the polarity.
            indices = x.long() + \
                    wd * y.long() + \
                    ht * wd * pol.long() + \
                    ht * wd * ch * t_idx.long()

At first, I expected the two methods to perform similarly. Surprisingly, your framing method performs better!
I was wondering whether you had tested both framing methods before and chose this scheme because it gave better results. Or do you know why sorting by polarity first gives a better result?
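
To make the two schemes concrete, here is a minimal pure-Python sketch of the same index arithmetic on scalar values. The sizes and the helper names `flat_index` and `frame_channel` are hypothetical and chosen only for illustration; `ch`, `bn`, `ht`, and `wd` are assumed to mean the number of polarities, time bins, height, and width, as in the snippet above.

```python
# Hypothetical small sizes, chosen only for illustration.
ch, bn, ht, wd = 2, 3, 4, 5  # polarities, time bins, height, width


def flat_index(x, y, t_idx, pol, sort_mode):
    """Flat index of one event, mirroring the two formulas quoted above."""
    if sort_mode == 'POL_SORT':
        # Polarity is the slowest-varying axis: layout (ch, bn, ht, wd).
        return x + wd * y + ht * wd * t_idx + bn * ht * wd * pol
    if sort_mode == 'TIME_SORT':
        # Time bin is the slowest-varying axis: layout (bn, ch, ht, wd).
        return x + wd * y + ht * wd * pol + ht * wd * ch * t_idx
    raise ValueError(f'unknown sort_mode: {sort_mode}')


def frame_channel(t_idx, pol, sort_mode):
    """Channel that (t_idx, pol) occupies once the first two axes are merged."""
    return flat_index(0, 0, t_idx, pol, sort_mode) // (ht * wd)


# The same (time bin, polarity) pair lands in a different channel per scheme.
print(frame_channel(1, 1, 'POL_SORT'))   # 4
print(frame_channel(1, 1, 'TIME_SORT'))  # 3

# Both schemes fill the same set of channels, only in a different order.
pol_sort_order = [frame_channel(t, p, 'POL_SORT') for p in range(ch) for t in range(bn)]
time_sort_order = [frame_channel(t, p, 'TIME_SORT') for t in range(bn) for p in range(ch)]
```

In this sketch the two schemes differ only by a fixed permutation of the `ch * bn` channels, which is why one might expect similar performance from both.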

Answer 1 · 2024-01-25T12:44:35.000Z

Conceptually, it should not make a difference: the first layer is a 2D convolution, so the order of the channels does not matter. However, if you made the changes mentioned above, you should probably also change the following line from

        representation = th.zeros((self.channels, self.bins, self.height, self.width),
                                  dtype=dtype, device=device, requires_grad=False)

to

        representation = th.zeros((self.bins, self.channels, self.height, self.width),
                                  dtype=dtype, device=device, requires_grad=False)
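
As an illustration of why the buffer shape should follow the index formula, the sketch below decodes one `TIME_SORT` flat index against both shapes. It is plain Python under assumed sizes; `unravel` is a hypothetical helper, not part of the repository, and the sketch assumes events are written through flat row-major indices.

```python
def unravel(flat, shape):
    """Convert a flat row-major index into a coordinate tuple for `shape`."""
    coords = []
    for size in reversed(shape):
        coords.append(flat % size)
        flat //= size
    return tuple(reversed(coords))


channels, bins, height, width = 2, 3, 4, 5  # hypothetical sizes

# One event: polarity 0, time bin 1, pixel (x=3, y=1).
x, y, t_idx, pol = 3, 1, 1, 0
time_sort_flat = x + width * y + height * width * pol + height * width * channels * t_idx

# Matching layout: the time bin is the leading axis.
matched = unravel(time_sort_flat, (bins, channels, height, width))
# Mismatched layout: the original (channels, bins, ...) buffer.
mismatched = unravel(time_sort_flat, (channels, bins, height, width))

print(matched)     # (1, 0, 1, 3): time bin 1, polarity 0, as intended
print(mismatched)  # (0, 2, 1, 3): read as polarity 0, time bin 2

# Once the first two axes are merged, both shapes give the same channel.
merged_matched = matched[0] * channels + matched[1]
merged_mismatched = mismatched[0] * bins + mismatched[1]
```

With the mismatched shape, the first two axes are mislabelled. In this sketch the merged channel index is the same for both shapes, so a final reshape that flattens those two axes would see identical data either way.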

Answer 2 · 2024-01-25T12:55:07.000Z

Hi @magehrig,
Thank you for your reply.
The results above are strange. At first, I did not change the following line:

        representation = th.zeros((self.bins, self.channels, self.height, self.width),
                                  dtype=dtype, device=device, requires_grad=False)

I assumed the result would be the same after the final reshape.

But after I saw this result

I changed the line as you suggested. However, I got similar results on a subset of GEN1.

This is strange, so I may need to repeat the experiments /(ㄒoㄒ)/~~.

Answer 3 · 2024-01-26T17:59:09.000Z

Great, let me know if that fixes your issue.

Answer 4 · 2024-01-29T12:20:44.000Z

OK, I will test it this time.