clovaai/tunit

FFHQ results


Hello :)

I have a question about Figure 4 in the paper.

In the paper, the FFHQ results were generated using the averaged style vectors of each domain.

However, when I checked, the style vector appears to be generated globally: there is only one style vector, shared by all the domains.

Could you explain what "averaged style vectors of each domain" means?

Thank you :)

If you are referring to "Figure 4: Cross-domain attribute translation using 0.1% of labeled samples",
I manually labeled some images for each attribute (glasses, aging, gender, hair color) and then trained TUNIT in a semi-supervised manner.

Taking "glasses" as an example, I labeled 35 images as "glasses" and another 35 images as "no glasses".
That gives two domains (glasses and no glasses), so TUNIT can be trained like a cross-domain translation model under semi-supervised learning with an extremely small number of labeled samples.

Because this becomes a cross-domain translation problem with two domains, we can compute the average style vector of each domain and translate images using that average vector.
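
A minimal sketch of the per-domain averaging described above, in plain Python. The function name `average_style_per_domain` and the toy data are hypothetical and are not taken from the TUNIT code; it only assumes that each labeled image already has a style vector and a domain label.

```python
from collections import defaultdict


def average_style_per_domain(style_vectors, domain_labels):
    """Return {domain: mean style vector} for per-image style vectors.

    Hypothetical helper for illustration, not part of the TUNIT repository.
    """
    if len(style_vectors) != len(domain_labels):
        raise ValueError("need exactly one domain label per style vector")

    sums = {}
    counts = defaultdict(int)
    for vec, dom in zip(style_vectors, domain_labels):
        if dom not in sums:
            sums[dom] = [0.0] * len(vec)
        # Accumulate the element-wise sum for this domain.
        sums[dom] = [s + v for s, v in zip(sums[dom], vec)]
        counts[dom] += 1

    # Divide each accumulated sum by the number of images in that domain.
    return {dom: [s / counts[dom] for s in total] for dom, total in sums.items()}


# Toy 2-D "style vectors" standing in for the real style codes.
styles = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [20.0, 2.0]]
labels = ["glasses", "glasses", "no_glasses", "no_glasses"]
avg = average_style_per_domain(styles, labels)
```

To translate an image toward "glasses", the generator would then be given `avg["glasses"]` as its style input instead of the style vector of a single reference image.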

Also, please note that the FFHQ results in Figure 5 are different from those in Figure 4.
(Figure 5: unsupervised / Figure 4: semi-supervised)


Actually, you can also conduct the translation with the average style vectors of a TUNIT model trained in an unsupervised manner. The clustering constructs K domains, so we can compute the average style vector of each domain and use it for translating images.
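
For the unsupervised case, a rough sketch under the assumption that each image comes with a vector of K cluster scores and a style vector, and that its domain is the highest-scoring cluster. The function `cluster_average_styles` and the toy values are hypothetical, not the repository's actual code.

```python
def cluster_average_styles(cluster_scores, style_vectors):
    """Group style vectors by argmax cluster and average each group.

    Hypothetical helper: cluster_scores[i] holds K scores for image i,
    and the assigned domain is the index of the largest score.
    """
    groups = {}
    for scores, vec in zip(cluster_scores, style_vectors):
        domain = max(range(len(scores)), key=scores.__getitem__)
        groups.setdefault(domain, []).append(vec)

    # zip(*vecs) iterates over the style dimensions of one domain's vectors.
    return {
        domain: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for domain, vecs in groups.items()
    }


# Toy example with K = 2 clusters and 2-D style vectors.
scores = [[2.0, 0.1], [1.5, 0.3], [0.0, 3.0]]
styles = [[0.0, 2.0], [2.0, 4.0], [5.0, 5.0]]
cluster_avg = cluster_average_styles(scores, styles)
```

Clusters that receive no images simply do not appear in the result, so a caller would need to handle missing domains.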