zalandoresearch/pytorch-vq-vae

EMA update before quantization

stangelid opened this issue · 5 comments

Hi and thanks for providing a nice and clean implementation of VQ-VAEs :)

While playing around with your code, I noticed that in VectorQuantizerEMA you first perform the EMA update of the codebook counts and embeddings, and then use the updated codebook embeddings as the quantized vectors (and for computing e_latent_loss).

In particular, the order in which you perform operations is:

  1. Nearest neighbour search
  2. EMA updates
  3. Quantization
  4. e_latent_loss computation

Is there a reason why you do the EMA updates before steps 3 and 4? My intuition says that the order should instead be (see the sketch after the list):

  1. Nearest neighbour search
  2. Quantization
  3. e_latent_loss computation
  4. EMA updates
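
To make the suggestion concrete, here is a minimal self-contained sketch of a forward pass with the reordered steps. Apart from _ema_cluster_size and _ema_w, the class and variable names (e.g. _embedding, _decay, _epsilon) are my own guesses rather than your exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizerEMASketch(nn.Module):
    """Condensed sketch of an EMA vector quantizer with the reordered steps.
    Not the repo's exact code; names other than _ema_cluster_size / _ema_w are assumed."""
    def __init__(self, num_embeddings, embedding_dim, commitment_cost=0.25,
                 decay=0.99, epsilon=1e-5):
        super().__init__()
        self._num_embeddings = num_embeddings
        self._embedding_dim = embedding_dim
        self._commitment_cost = commitment_cost
        self._decay = decay
        self._epsilon = epsilon
        embed = torch.randn(num_embeddings, embedding_dim)
        self.register_buffer('_embedding', embed)
        self.register_buffer('_ema_cluster_size', torch.zeros(num_embeddings))
        self.register_buffer('_ema_w', embed.clone())

    def forward(self, inputs):  # inputs: (..., embedding_dim)
        flat = inputs.reshape(-1, self._embedding_dim)

        # 1. Nearest-neighbour search over the codebook
        dists = (flat.pow(2).sum(1, keepdim=True)
                 - 2 * flat @ self._embedding.t()
                 + self._embedding.pow(2).sum(1))
        indices = dists.argmin(1)
        encodings = F.one_hot(indices, self._num_embeddings).type(flat.dtype)

        # 2. Quantization using the *current* (not yet updated) codebook
        quantized = (encodings @ self._embedding).view_as(inputs)

        # 3. e_latent_loss computed against the same codebook used for quantization
        e_latent_loss = F.mse_loss(quantized.detach(), inputs)
        loss = self._commitment_cost * e_latent_loss

        # 4. EMA updates of cluster sizes and codebook, done last
        if self.training:
            with torch.no_grad():
                self._ema_cluster_size.mul_(self._decay).add_(
                    encodings.sum(0), alpha=1 - self._decay)
                n = self._ema_cluster_size.sum()
                # Laplace smoothing keeps every cluster size strictly positive
                cluster_size = ((self._ema_cluster_size + self._epsilon)
                                / (n + self._num_embeddings * self._epsilon) * n)
                dw = encodings.t() @ flat
                self._ema_w.mul_(self._decay).add_(dw, alpha=1 - self._decay)
                self._embedding.copy_(self._ema_w / cluster_size.unsqueeze(1))

        # Straight-through estimator so gradients flow back to the encoder
        quantized = inputs + (quantized - inputs).detach()
        return quantized, loss
```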

Looking forward to hearing your thoughts!

Many thanks,
Stefanos

@stangelid thanks for the issue! I think your intuition is right. I don't remember my exact notes from when I implemented this, but thinking it through, your ordering looks correct. If you can send a PR I'll be happy to merge it; otherwise I'll add it to my TODO!

I have a question in the same context: can you please explain why you apply Laplace smoothing to the cluster sizes _ema_cluster_size? I am having a hard time understanding why (to my knowledge it was not mentioned in the paper). Thanks.

@yassouali According to my understanding, the Laplace smoothing makes sure that no element of _ema_cluster_size is ever exactly zero. If that happened, it would result in a division by zero when updating _ema_w.
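
Roughly, the smoothing step might look like the small numeric example below (the exact formula and epsilon value are my assumption of a typical implementation, not a quote from the notebook):

```python
import torch

num_embeddings, epsilon = 4, 1e-5
# Suppose one code was never selected in recent batches:
ema_cluster_size = torch.tensor([10.0, 0.0, 5.0, 5.0])
n = ema_cluster_size.sum()

# Laplace smoothing: every entry becomes strictly positive,
# while the total mass n is (approximately) preserved.
smoothed = (ema_cluster_size + epsilon) / (n + num_embeddings * epsilon) * n
print(smoothed)  # roughly [10.0, 1e-5, 5.0, 5.0] -- no exact zeros
# Dividing _ema_w by the smoothed cluster sizes is now safe.
```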

I know this is late, but hope it helps :)

Thanks @stangelid, perhaps I'll add this explanation to the notebook.

Thank you.