Has the switching self-attention been applied to all stages?
ziyaxuanyi opened this issue · 2 comments
ziyaxuanyi commented
The paper says the switching self-attention should only be applied to the late up blocks of the UNet to achieve the best results.
However, in this code it looks like the switching self-attention is applied to all stages. A rough sketch of what I mean is below.
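For illustration, here is a minimal sketch (not this repository's actual code) of how a custom self-attention processor could be restricted to the late up blocks of a diffusers-style UNet. The class name `SwitchingSelfAttnProcessor`, the helper `apply_switching_to_late_up_blocks`, and the choice of `up_blocks.2` / `up_blocks.3` as the "late" blocks are all assumptions for the example.

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor


class SwitchingSelfAttnProcessor(AttnProcessor):
    """Hypothetical placeholder: the actual switching self-attention logic
    would override the default behaviour inherited from AttnProcessor."""
    pass


def apply_switching_to_late_up_blocks(unet, late_up_blocks=("up_blocks.2", "up_blocks.3")):
    """Install the switching processor only on self-attention (attn1) layers
    inside the chosen late up blocks; keep the default processor elsewhere."""
    processors = {}
    for name, default_proc in unet.attn_processors.items():
        is_self_attn = name.endswith("attn1.processor")            # attn1 = self-attention
        in_late_up = any(name.startswith(b) for b in late_up_blocks)
        if is_self_attn and in_late_up:
            processors[name] = SwitchingSelfAttnProcessor()
        else:
            processors[name] = default_proc
    unet.set_attn_processor(processors)


# Randomly initialized UNet, purely to demonstrate the filtering.
unet = UNet2DConditionModel()
apply_switching_to_late_up_blocks(unet)
```

The idea is simply to filter `unet.attn_processors` by module name before calling `set_attn_processor`, rather than replacing every processor.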
ExponentialML commented
This seems to be the case! I completely overlooked it.
ExponentialML commented
I'm going to close this issue, since the functionality has now been implemented.
Please feel free to ping again if there are any concerns that relate to this issue.