Pinned Repositories
FasterTransformer
Transformer-related optimization, including BERT and GPT
server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
kjtaed's Repositories
kjtaed/FasterTransformer
Transformer-related optimization, including BERT and GPT