Incomplete explanation
lix19937 opened this issue · 1 comment
lix19937 commented
Branch/Tag/Commit
v5.3_tag
Docker Image Version
22.08
GPU name
RTX 3070
CUDA Driver
470.129.06
Reproduced Steps
https://github.com/NVIDIA/FasterTransformer/blob/release/v5.3_tag/src/fastertransformer/layers/attention_layers/FusedAttentionLayer.h#L28
// This class is only used when we satisfy the following conditions:
// 1. FP16
// 2. Temporally add seqlen <= 512 limitation because the
template<typename T>
The comment block above is cut off mid-sentence: condition 2 ends with "because the", so the rationale for the seqlen <= 512 limitation is never stated. Could the explanation be completed?
lix19937 commented
Duplicate.