Time window for COCO dataset training and inference, and time window and dataloader used for GEN1
gwgknudayanga opened this issue · 10 comments
1. coco.yaml is missing.
2. I just ran the code on the COCO dataset. It seems the time window is always 1. According to your mem_update logic, in this case the neuron experiences no membrane voltage decay; it simply sends the received value, clamped to a maximum of 4. So what time window did you use to make it behave like a spiking neuron?
3. Even with that setting, I get the following error when running:
   File "/home/atiye/SpikeYOLO/ultralytics/models/yolo/detect/val.py", line 146, in get_stats
   self.nt_per_class = np.bincount(stats[-1].astype(int), minlength=self.nc) # number of targets per class
   IndexError: list index out of range
4. Another question: how many time bins (what time window) did you use to organize the GEN1 DVS data? Could you please provide the dataloader code for GEN1, since with it the spiking neurons would show real, membrane-decay-based spiking behavior?
5. Also, since mem_update is used for both training and validation, I cannot see how binary spikes (1, 0) are available during inference, as mentioned in the paper.
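To make the time-window question concrete, here is a minimal sketch of an integer-valued leaky neuron. It is not the repository's mem_update: the function name `integer_lif`, the `decay` value, and the soft reset are assumptions for illustration. Only the round-and-clamp-to-4 behavior comes from the description above. With a single time step the decay term never acts, so the output is just the clamped input.

```python
def integer_lif(inputs, decay=0.25, max_value=4):
    """Simulate one neuron over len(inputs) time steps.

    Hypothetical sketch: `decay`, the soft reset, and this function name are
    assumptions for illustration, not the repository's mem_update.
    """
    membrane = 0.0
    outputs = []
    for current in inputs:
        # Leak the previous membrane potential, then integrate the new input.
        # At the first step the membrane is 0.0, so the decay has no effect.
        membrane = decay * membrane + current
        # Emit an integer activation: round, then clamp to [0, max_value].
        spike = min(max(round(membrane), 0), max_value)
        # Soft reset: subtract what was emitted.
        membrane -= spike
        outputs.append(spike)
    return outputs


print(integer_lif([7.3]))                   # T = 1: plain round-and-clamp
print(integer_lif([1.3, 1.3]))              # T = 2 with decay
print(integer_lif([1.3, 1.3], decay=1.0))   # T = 2 without leak, for comparison
```

The second and third calls differ only at the second time step, which is the earliest point at which a leak can change the output.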
1, 3. We re-uploaded coco.yaml; please try again.
2. You can change the time window in cfg/model/snn_yolov8.yaml [MS_GetT].
4. The time window for Gen1 is the same as reported in the paper. The Gen1 dataset will be uploaded in a few weeks.
5. There is no function in the code that converts an integer into a pulse train, because doing so gives no inference speedup or power saving on a GPU. It is easy to implement, though: for example, you just turn a 3 into three 1s.
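The "turn a 3 into three 1s" idea from point 5 can be sketched as follows. This is a hypothetical helper, not code from the repository: the name `expand_to_spikes`, the maximum value of 4, and the choice to place the 1s in the earliest virtual time steps are all assumptions.

```python
def expand_to_spikes(values, max_value=4):
    """Expand integer activations into binary spikes over max_value virtual steps.

    values: list of integers in [0, max_value], one per neuron.
    Returns max_value lists (one per virtual time step) of 0/1 per neuron.
    Assumption: the 1s are placed in the earliest steps.
    """
    for v in values:
        if not 0 <= v <= max_value:
            raise ValueError(f"activation {v} outside [0, {max_value}]")
    return [[1 if step < v else 0 for v in values] for step in range(max_value)]


spikes = expand_to_spikes([3, 0, 4, 1])
for step, row in enumerate(spikes):
    print(step, row)

# Summing over the virtual steps recovers the original integers.
print([sum(column) for column in zip(*spikes)])
```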
@XinhaoLuo666,
Regarding the GEN1 dataset evaluation: I organized each sample of the dataset into 5 time steps, each with 2 channels. I then fed these to your network and trained it. However, the maximum mAP@50 I could obtain on the validation set is 36.1%. (I used horizontal and vertical data augmentations.) I don't know why I am getting such low values compared to those reported in the paper.
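For readers following along, one way to organize events into T time steps with 2 polarity channels is sketched below. This is not the authors' dataloader: the function name `events_to_frames`, the `(t, x, y, p)` event layout, equal-duration bins, and binary (rather than count) frames are all assumptions.

```python
def events_to_frames(events, num_steps, height, width, t_start, t_end):
    """Bin events into frames[step][polarity][y][x] holding 0/1 values.

    events: iterable of (t, x, y, p) with polarity p in {0, 1}.
    Assumptions: equal-duration time bins over [t_start, t_end) and
    binary frames (a pixel is 1 if at least one event fell on it).
    """
    frames = [
        [[[0] * width for _ in range(height)] for _ in range(2)]
        for _ in range(num_steps)
    ]
    duration = t_end - t_start
    for t, x, y, p in events:
        # Skip events outside the time window or the sensor area.
        if not (t_start <= t < t_end and 0 <= x < width and 0 <= y < height):
            continue
        step = int((t - t_start) * num_steps // duration)
        frames[step][p][y][x] = 1
    return frames


# Tiny example: 3 events, 5 time steps, a 2 x 3 sensor, 1000 us window.
events = [(0, 1, 0, 1), (250, 2, 1, 0), (999, 0, 0, 1)]
frames = events_to_frames(events, num_steps=5, height=2, width=3,
                          t_start=0, t_end=1000)
print(len(frames), len(frames[0]), len(frames[0][0]), len(frames[0][0][0]))
```

Stacking N such samples would give the (T, N, C, H, W) layout discussed in this thread.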
Across the different time steps T, do you guarantee that the same augmentation is applied to each? If, say, the first frame is left unchanged while the third is flipped left-right, the model's ability to learn the temporal task will be significantly damaged.
@XinhaoLuo666
In the GEN1 dataset there are no images, only events, isn't that right? I feed the raw events of each sample directly, organized into a tensor of shape (5,16,2,640,640), i.e. (T,N,C,H,W). So any augmentation is applied to the whole tensor at once, and therefore to all 5 time steps in the same way.
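The point about applying one augmentation to all time steps can be sketched like this. It is an illustrative helper, not code from SpikeYOLO: the name `flip_all_steps`, the nested-list layout `frames[t][c][y][x]` for a single sample, and the `(x_min, y_min, x_max, y_max)` pixel-edge box convention are assumptions. The flip decision is drawn once and then used for every time step and for the boxes.

```python
import random


def flip_all_steps(frames, boxes, width, p=0.5, rng=random):
    """Horizontally flip every time step together, or none of them.

    frames: nested list frames[t][c][y][x] for one sample.
    boxes: list of (x_min, y_min, x_max, y_max) in pixel-edge coordinates
           (assumed convention: a box spans [x_min, x_max) horizontally).
    """
    # One random draw for the whole sample, so all steps stay consistent.
    if rng.random() >= p:
        return frames, boxes
    flipped = [
        [[row[::-1] for row in channel] for channel in step]
        for step in frames
    ]
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped, flipped_boxes


# Two time steps, one channel, a 1 x 3 frame; p=1.0 forces the flip.
frames = [[[[1, 0, 0]]], [[[0, 1, 0]]]]
print(flip_all_steps(frames, [(0, 0, 1, 1)], width=3, p=1.0))
```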
I don't know why the mAP@50 is 36.1% even after 200 iterations. It would be better if you could upload the GEN1 data-loading code with guidelines/README and a sample weights file, so that it is easy to reproduce the results and build on top of them.
Thank you.
How do I use the GEN1 dataset? Could anyone show me how?
Hi @XinhaoLuo666,
Is the class for preprocessing and loading GEN1 event data for SpikeYOLO available for download now?
It doesn't seem so yet. Also, if you resized the GEN1 data spatially before feeding it to your network, what was the resized resolution?
We have uploaded a new folder "SpikeYOLO_for_Gen1" that can be used for training and inference directly on the GEN1 dataset
Thank you. I just had a look. It seems that even for the GEN1 data, the same augmentations as for image-based data are used.
For example, v8_transform, as used in SpikeYOLO for GEN1, includes the RandomPerspective transformation. Inside it, OpenCV APIs are called and padding values such as 114 are appended to the binary spike data. I thought the OpenCV API was meant only for image data. Also, padding binary spike data with the value 114 might cause issues. So I take it you use the same image augmentations for events without any problem, right? Or am I missing something?
No, we did not use augmentation when working with neuromorphic data. The code does execute the "RandomPerspective" transformation function, but this function has been altered internally to remove the augmentation part.
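On the padding concern raised in this thread: if a binary event frame does need to be padded to a fixed size, filling with 0 keeps it binary, whereas an image-style gray fill such as 114 does not. The sketch below is hypothetical and is not taken from the SpikeYOLO_for_Gen1 folder; the names `pad_binary_frame` and `is_binary`, and the choice to pad on the bottom and right only, are assumptions.

```python
def pad_binary_frame(frame, target_h, target_w, fill=0):
    """Pad a 2D list of 0/1 values on the bottom and right with `fill`.

    Assumption: bottom/right padding only (no centering), fill defaults to 0.
    """
    h = len(frame)
    w = len(frame[0]) if frame else 0
    if target_h < h or target_w < w:
        raise ValueError("target size must not be smaller than the frame")
    padded = [row + [fill] * (target_w - w) for row in frame]
    padded += [[fill] * target_w for _ in range(target_h - h)]
    return padded


def is_binary(frame):
    """Return True if every value in the 2D list is 0 or 1."""
    return all(v in (0, 1) for row in frame for v in row)


frame = [[1, 0], [0, 1]]
print(is_binary(pad_binary_frame(frame, 3, 4)))            # zero fill
print(is_binary(pad_binary_frame(frame, 3, 4, fill=114)))  # image-style fill
```

A check like `is_binary` can be run after any transform pipeline to confirm that event frames have not picked up image-style fill values.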