NVlabs/ssn_superpixels

Question about the function of spix_init

yueyu-stu opened this issue · 2 comments

Hi, thanks for your work.
I have a question about spix_init in create_ssn.py. It seems to be a constant that never changes between calls (e.g. in Passoc_layer.cu and spixel_feature2_layer.cu, it is a const Dtype* named index_data).
As I understand it, spix_init provides, for each pixel, the valid labels of the superpixels surrounding that pixel's initial superpixel. However, in each iteration, the label of a pixel may change once the distances between it and the surrounding cluster centers are recomputed, so the pixel may no longer belong to its initial superpixel after several iterations.
For instance (this may not be accurate or realistic), suppose a pixel belongs to superpixel #57, and the surrounding superpixels are #46, #47, #48, #56, #58, #66, #67 and #68. After several iterations, this pixel may belong to #56, and the superpixels surrounding #57 may become a different set of 8 superpixels - maybe one of them is #69. But because the index is fixed, the pixel's label is fixed too, meaning that this pixel always belongs to superpixel #57 (in spixel_feature2_layer.cu). Besides, the labels of the surrounding superpixels are also fixed. So I cannot understand this, and I think it may not reflect the actual clustering process.
Thanks for reading this long description. I look forward to your explanation of spix_init.

That's a good observation. Yes, that is correct. The neighborhood superpixel indices are fixed and may not be optimal across iterations. We do this for efficiency and also to make the iterations differentiable. There is an inherent assumption here that the superpixels move around together and that a pixel can only belong to one of the nine surrounding superpixels. I tried using a bigger 5x5 neighborhood (25 superpixels) instead of 3x3 (9 superpixels) and did not observe any significant improvement in performance. So, in practice, using a fixed set of surrounding superpixel indices for each pixel works fine. Conceptually, it would be better to update the neighborhood indices dynamically, and doing this in a differentiable fashion would be technically interesting.
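To make the fixed-neighborhood idea concrete, here is a minimal NumPy sketch. It is not the repository's Caffe/CUDA code: the function names, the regular-grid initialization, the use of -1 for off-grid neighbors, and the softmax over negative squared distances are all assumptions chosen for illustration. It uses a grid with 10 superpixel columns so that the #57 example from the question applies.

```python
import numpy as np

def init_spixel_grid(height, width, n_rows, n_cols):
    """Hypothetical stand-in for spix_init: a fixed grid label per pixel."""
    rows = np.minimum(np.arange(height) * n_rows // height, n_rows - 1)
    cols = np.minimum(np.arange(width) * n_cols // width, n_cols - 1)
    return rows[:, None] * n_cols + cols[None, :]          # (H, W)

def neighbor_indices(spix_init, n_rows, n_cols):
    """Nine candidate superpixels (3x3 around the initial one) per pixel.
    Off-grid candidates are marked with -1 (an assumption of this sketch)."""
    r, c = np.divmod(spix_init, n_cols)
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            valid = (rr >= 0) & (rr < n_rows) & (cc >= 0) & (cc < n_cols)
            out.append(np.where(valid, rr * n_cols + cc, -1))
    return np.stack(out, axis=-1)                           # (H, W, 9)

def soft_assoc(features, centers, nbr):
    """Softmax association of each pixel with its nine fixed candidates.
    features: (H, W, C), centers: (K, C), nbr: (H, W, 9)."""
    valid = nbr >= 0
    cand = centers[np.where(valid, nbr, 0)]                 # (H, W, 9, C)
    d2 = ((features[:, :, None, :] - cand) ** 2).sum(-1)    # squared distances
    logits = np.where(valid, -d2, -np.inf)                  # mask off-grid slots
    logits -= logits.max(axis=-1, keepdims=True)            # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)                # (H, W, 9)

# Demo: 100x100 image, 10x10 superpixel grid, features = pixel coordinates.
H = W = 100
spix = init_spixel_grid(H, W, 10, 10)
nbr = neighbor_indices(spix, 10, 10)
print(nbr[55, 75])  # [46 47 48 56 57 58 66 67 68]

yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
features = np.stack([yy, xx], axis=-1).astype(float)
k = np.arange(100)
centers = np.stack([(k // 10) * 10 + 4.5, (k % 10) * 10 + 4.5], axis=-1)

centers[56] = [55.0, 75.0]          # pretend center #56 drifted onto the pixel
w = soft_assoc(features, centers, nbr)
hard = np.take_along_axis(nbr, w.argmax(-1)[..., None], axis=-1)[..., 0]
print(hard[55, 75])  # 56, while the candidate list for this pixel is unchanged
```

In this sketch the index map only fixes which nine superpixels a pixel is compared against. The association weights, and therefore the hard label derived from them, can still move among those nine candidates from one iteration to the next.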

Yes, that makes sense. Thank you!