Pinned Repositories
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
fast-llama
Runs LLaMA at extremely high speed
ftl
C++ Fast Template Libraries
KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Latte
The official implementation of Latte: Latent Diffusion Transformer for Video Generation.
learn_dl
Deep learning algorithms source code for beginners
llm-inference-acceleration-handbook
LookaheadDecoding
rapid-llama-src
CoderLSF's Repositories
CoderLSF/fast-llama
CoderLSF/llm-inference-acceleration-handbook
CoderLSF/LookaheadDecoding
CoderLSF/DeepSpeed
CoderLSF/DeepSpeed-MII
CoderLSF/ftl
CoderLSF/KVQuant
CoderLSF/Latte
CoderLSF/learn_dl
CoderLSF/rapid-llama-src