huggingface/open-muse

Token masking strategies

isamu-isozaki opened this issue · 0 comments

This is experimental and not a priority, but I wanted to list two denoising strategies that seem interesting to compare:

  1. Noise Mask, which was proposed in Paella ([here](https://arxiv.org/pdf/2211.07292v1.pdf)). I think this is very similar, but the distribution/sampling method for choosing which tokens to mask is different.
  2. Random Tokens, which were proposed [here](https://arxiv.org/pdf/2206.12351.pdf). I still need to fully understand it, but basically the whole idea of masking tokens is removed and you instead start with random tokens from the codebook.
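
To make the difference concrete, here is a rough sketch of the two ways of corrupting a token sequence as I currently understand them: replacing positions with a dedicated mask id versus replacing them with random codebook ids. This is not the exact procedure from either paper. The function names, the uniform choice of positions, the `codebook_size` of 8192, and using `codebook_size` as the mask id are all assumptions for illustration, and it uses plain Python lists rather than tensors to keep it self-contained.

```python
import random


def corrupt_with_mask(tokens, ratio, mask_id, rng):
    """Replace a uniformly chosen subset of positions with a dedicated mask id.

    Hypothetical helper: the real sampling schedule may differ.
    """
    n = round(ratio * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n))
    corrupted = [mask_id if i in positions else t for i, t in enumerate(tokens)]
    return corrupted, positions


def corrupt_with_random_tokens(tokens, ratio, codebook_size, rng):
    """Replace a uniformly chosen subset of positions with random codebook ids.

    Hypothetical helper: no mask id is needed, so every value stays a valid
    codebook index. A replacement may happen to equal the original token.
    """
    n = round(ratio * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n))
    corrupted = [
        rng.randrange(codebook_size) if i in positions else t
        for i, t in enumerate(tokens)
    ]
    return corrupted, positions


rng = random.Random(0)
codebook_size = 8192      # assumed codebook size
mask_id = codebook_size   # assumed: mask id sits just past the codebook
tokens = [rng.randrange(codebook_size) for _ in range(16)]

masked, masked_positions = corrupt_with_mask(tokens, 0.5, mask_id, rng)
noised, noised_positions = corrupt_with_random_tokens(tokens, 0.5, codebook_size, rng)
```

With `ratio=1.0`, the second function returns a sequence made entirely of random codebook tokens, which matches the "start with random tokens" idea in item 2.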

Overall, these are somewhat experimental features that I'm still learning about, but I think it will be interesting to compare them once we get a chance.