S-Lab-System-Group/Awesome-DL-Scheduling-Papers

OSDI'22

Achieving μs-scale Preemption for Concurrent GPU-accelerated DNN Inferences
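This paper proposes a hard real-time, highly concurrent GPU scheduling technique for DNN inference tasks. It supports preempting non-real-time tasks within hundreds of microseconds and can improve overall system throughput by 1.1x to 4.3x.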