ByteTransformer

Optimized BERT transformer inference on NVIDIA GPUs. Paper: https://arxiv.org/abs/2210.03052

Primary language: C++
License: Apache License 2.0 (Apache-2.0)
