Pinned Repositories
assign3
assignment5
assignment6
assignment7
de1star.github.io
former_face_figure
msccl
Microsoft Collective Communication Library
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques, including quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
de1star's Repositories
de1star/assign3
de1star/assignment5
de1star/assignment6
de1star/assignment7
de1star/de1star.github.io
de1star/former_face_figure
de1star/msccl
Microsoft Collective Communication Library