Limiting attention radius and extracting embeddings
george-henderson opened this issue · 1 comment
Hello,
Is it possible to alter the model's attention radius, such that the model only applies attention within a certain window in the input?
A second question: Can you explain how I might extract the embeddings from the model? I am using the model out of the box, so the final output is a tensor of shape num_batches x input_length x num_tokens, but I'd like to access the internal latent-space representation of my text as well.
Thank you!
Is it possible to alter the model's attention radius, such that the model only applies attention within a certain window in the input?
This is not possible with the current model.
A second question: Can you explain how I might extract the embeddings from the model? I am using the model out of the box, so the final output is a tensor of shape num_batches x input_length x num_tokens, but I'd like to access the internal latent-space representation of my text as well.
Here's an example of extracting the last hidden layer outputs from the model: #32
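In the meantime, here is a general sketch of the forward-hook approach in PyTorch. The `ToyModel` below is a hypothetical stand-in (the real model's module names and sizes will differ; inspect them with `print(model)` and point the hook at the module just before the output head):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the released model: a small Transformer
# encoder followed by a per-token classification head that produces
# the num_batches x input_length x num_tokens output described above.
class ToyModel(nn.Module):
    def __init__(self, vocab_size=32, d_model=16, num_tokens=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_tokens)

    def forward(self, x):
        return self.head(self.encoder(self.embed(x)))

model = ToyModel().eval()

# Forward hook: capture the output of the module feeding the head,
# i.e. the last hidden layer ("embedding") of the input sequence.
captured = {}
def save_hidden(module, inputs, output):
    captured["hidden"] = output.detach()

hook = model.encoder.register_forward_hook(save_hidden)

tokens = torch.randint(0, 32, (1, 10))  # num_batches x input_length
with torch.no_grad():
    logits = model(tokens)
hook.remove()

embeddings = captured["hidden"]  # num_batches x input_length x d_model
```

The same pattern works for any intermediate layer: register the hook on whichever submodule's output you want, run a forward pass, then remove the hook.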