microsoft/mutransformers

Question: Shouldn't learnable positional embeddings be MuReadout layers?

codedecde opened this issue · 0 comments

Hi!
I was wondering whether (learned) positional embeddings should be MuReadout layers (

self.position_embeddings = nn.Embedding(config.max_position_embeddings, config.hidden_size)

), since they map to a finite-dimensional space?
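To make the shapes in question concrete, here is a minimal framework-free sketch of the dimension bookkeeping (the concrete sizes 512/768/30522 are illustrative assumptions, not taken from the repo). In μP terms, hidden_size is the "width" that is scaled, while the number of positions and the readout's output classes stay finite:

```python
# Illustrative sizes (assumptions, not from mutransformers itself):
max_position_embeddings = 512   # finite: number of positions
hidden_size = 768               # "width": the dimension scaled under muP
vocab_size = 30522              # finite: output classes of a readout layer

# nn.Embedding(max_position_embeddings, hidden_size) is a lookup table
# whose rows live in the width-dimensional hidden space, i.e. it maps a
# finite index set INTO the width dimension:
position_embedding_shape = (max_position_embeddings, hidden_size)

# A MuReadout(hidden_size, vocab_size)-style layer maps FROM the width
# dimension TO a finite output space:
readout_weight_shape = (vocab_size, hidden_size)

# Output dimension of the positional embedding is the scaled width,
# whereas the output dimension of a readout is finite:
assert position_embedding_shape[1] == hidden_size
assert readout_weight_shape[0] == vocab_size
```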

I'd be grateful for any advice :)

Thank you!