Boosts deep learning service throughput by 1.5-4x through ensemble pipeline serving with concurrent CUDA streams, with a PyTorch/LibTorch frontend and backends such as TensorRT and CVCUDA.
Primary language: C++. License: Apache License 2.0 (Apache-2.0).