preddy5/Im2Vec

Emoji dataset is fixed to 4 parts, and every part is filled with a constant color; does this mean im2vec can only deal with

Closed this issue · 5 comments

Great job, first of all. But after studying and running your code for several weeks, I found that the Emoji dataset is fixed to 4 parts in your code, and that every part is filled with a constant color and a certain path in non-verbose mode. Does this mean im2vec can only deal with very simple images, and that one must preset how many parts the image has and which color each part uses?

Each image in the Emoji dataset is separated into 4 parts, and the number 4 is preset.

In the im2vec paper, the authors provide results for more complex shapes. In Section 3.4 of the paper, they mention that, for a given training dataset, they segment the samples and cluster the segments by spatial position, assigning a different color to each cluster. Perhaps they used 4 clusters for the emoji dataset. From the paper, it appears that the number of segments/clusters varies with the topological complexity of the samples across datasets, but it may be fixed for a given dataset. Perhaps im2vec can be trained on a combination of datasets (e.g., the emoji and icon datasets).
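To make the "cluster segments by spatial position and color each cluster" idea concrete, here is a minimal sketch. It is not the Im2Vec preprocessing code: the function `cluster_segments`, the plain k-means with farthest-point initialisation, the `PALETTE` colors, and the toy centroids are all assumptions chosen for illustration, with `k=4` matching the number of parts discussed in this thread.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def cluster_segments(centroids, k=4, iters=20):
    """Group segment centroids into k spatial clusters with plain k-means.

    Hypothetical helper: the paper does not specify this exact procedure.
    """
    pts = [(float(x), float(y)) for x, y in centroids]
    # Deterministic farthest-point initialisation keeps the sketch reproducible.
    centers = [pts[0]]
    while len(centers) < k:
        centers.append(max(pts, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(pts)
    for _ in range(iters):
        # Assign every segment to its nearest cluster center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in pts]
        # Move each center to the mean of its assigned segments.
        for j in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == j]
            if members:
                centers[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels


# Assumed palette: one constant RGB color per cluster.
PALETTE = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]

# Toy segment centroids in normalised image coordinates (made-up data).
segment_centroids = [
    (0.10, 0.10), (0.12, 0.08),
    (0.90, 0.10), (0.88, 0.12),
    (0.10, 0.90), (0.09, 0.91),
    (0.90, 0.90), (0.92, 0.88),
]

labels = cluster_segments(segment_centroids, k=4)
segment_colors = [PALETTE[lab] for lab in labels]
```

Under this scheme, segments that sit in the same region of the canvas across samples receive the same constant color, which is one plausible reading of why the part count would be fixed per dataset.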

Hey @eveybody2
Thank you. Yes, you are right: we use 4 parts in the code, and every part is filled with a constant color. This helps the network converge in a more stable manner. When there are multiple paths with the same color, the gradients can get noisy.
Also, as you observed, such coloring preprocessing means that it would be hard for the method to handle complex shapes with numerous paths. While the method can work with a few complex shapes out of the box, getting it to work with complex shapes consistently is still an open problem.

Regards,
Pradyumna.

Thanks a lot. I think the fundamental problem is how to make color differentiable in differentiable rasterization. We must find a robust and stable way to backpropagate gradients for variable colors if our ultimate goal is to vectorize natural images with deep learning.

Or maybe it will prove to be unsolvable in XML-based representations like SVG. Then we will need to invent a new representation for vector graphics.
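For reference, the simplest case of backpropagating to a color can be sketched as follows. This is a toy illustration, not Im2Vec or any rasterizer's code: it assumes a single flat-filled grayscale shape with a fixed soft coverage mask `alpha`, composited over a constant background, and the name `fit_fill_color` is hypothetical. It does not cover the harder case raised above of variable colors across many overlapping paths.

```python
def fit_fill_color(alpha, target, background=1.0, steps=200, lr=0.5):
    """Recover a flat fill color by gradient descent on a squared pixel loss.

    alpha  : per-pixel soft coverage of the shape, in [0, 1] (held fixed here)
    target : per-pixel target intensities
    """
    color = 0.0
    n = len(alpha)
    for _ in range(steps):
        grad = 0.0
        for a, t in zip(alpha, target):
            # Alpha compositing: the shape color over the background.
            pixel = a * color + (1.0 - a) * background
            # d(pixel - t)^2 / d(color) = 2 * (pixel - t) * a
            grad += 2.0 * (pixel - t) * a
        color -= lr * grad / n
    return color


# Made-up coverage mask and a target rendered with a known fill color.
alpha = [0.0, 0.25, 1.0, 1.0, 0.5, 0.0]
true_color = 0.3
target = [a * true_color + (1.0 - a) * 1.0 for a in alpha]

estimate = fit_fill_color(alpha, target)
```

In this restricted setting the gradient with respect to the color is just the coverage-weighted pixel error, so the fit is well behaved; the open question in the thread is what happens once shape, occlusion order, and many colors are optimised together.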