cody-moveworks's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
wagoodman/dive
A tool for exploring each layer in a Docker image
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
guidance-ai/guidance
A guidance language for controlling large language models.
onnx/onnx
Open standard for machine learning interoperability
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
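LoRA, one of the adapter methods PEFT implements, freezes the pretrained weight matrix and trains only a low-rank update, so the forward pass becomes y = Wx + (α/r)·B(Ax). A toy pure-Python sketch of that forward pass (shapes and names are illustrative, not the PEFT API):

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_forward(x, W, A, B, alpha=1.0):
    """LoRA forward pass: y = W x + (alpha / r) * B (A x).

    W is frozen (d_out x d_in); only A (r x d_in) and B (d_out x r)
    are trained, i.e. r * (d_in + d_out) parameters instead of d_in * d_out.
    """
    r = len(A)  # rank of the update = number of rows of A
    base = matvec(W, x)            # frozen pretrained path
    low = matvec(B, matvec(A, x))  # low-rank trained path
    return [b + (alpha / r) * l for b, l in zip(base, low)]
```

With rank r = 1 on a 2x2 layer, the update costs 4 trained numbers instead of 4 frozen ones; the savings grow quadratically with layer width.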
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inference solution.
huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
TimDettmers/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
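The core idea behind k-bit quantization is mapping floating-point weights onto a small integer grid with a per-tensor scale. A minimal sketch of symmetric absmax int8 quantization, the basic scheme that bitsandbytes' k-bit routines build on (a pure-Python illustration, not the library's API):

```python
def quantize_absmax_int8(xs):
    """Symmetric absmax quantization: scale floats into the int8 range [-127, 127]."""
    scale = max(abs(x) for x in xs) / 127.0
    if scale == 0.0:  # all-zero tensor: nothing to scale
        return [0 for _ in xs], 0.0
    return [round(x / scale) for x in xs], scale

def dequantize(qs, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [q * scale for q in qs]
```

Each value is stored as one int8 plus a shared float scale; the round-trip error is bounded by half a quantization step (scale / 2).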
jcjohnson/pytorch-examples
Simple examples to introduce PyTorch
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
kelvins/awesome-mlops
:sunglasses: A curated list of awesome MLOps tools
adapter-hub/adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
VertaAI/modeldb
Open Source ML Model Versioning, Metadata, and Experiment Management
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
brainsik/virtualenv-burrito
One command to have a working virtualenv + virtualenvwrapper environment.
feifeibear/LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
idiap/importance-sampling
Code for experiments regarding importance sampling for training neural networks
xuyxu/Soft-Decision-Tree
PyTorch implementation of "Distilling a Neural Network Into a Soft Decision Tree" by Nicholas Frosst and Geoffrey Hinton, 2017.
shreyansh26/Speculative-Sampling
Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind
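Both of these repos implement the same accept/reject rule: a cheap draft model proposes a token x from its distribution q, the target model accepts it with probability min(1, p(x)/q(x)), and on rejection resamples from the normalized residual max(0, p − q), which keeps the output distributed exactly according to the target p. A minimal single-token sketch with toy distributions (not either repo's API):

```python
import random

def speculative_step(p, q, rng):
    """One accept/reject step of speculative sampling.

    p: target-model distribution over tokens (dict token -> prob)
    q: draft-model distribution over the same tokens
    Returns a token distributed exactly according to p.
    """
    tokens = list(q)
    # Draft model proposes a token from q.
    x = rng.choices(tokens, weights=[q[t] for t in tokens])[0]
    # Target model accepts with probability min(1, p(x) / q(x)).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x
    # On rejection, resample from the residual max(0, p - q), renormalized.
    residual = {t: max(0.0, p[t] - q[t]) for t in tokens}
    z = sum(residual.values())
    return rng.choices(tokens, weights=[residual[t] / z for t in tokens])[0]
```

The speedup comes from the draft model proposing several tokens per target-model forward pass; this sketch shows only why a single accepted or resampled token is still an exact sample from p.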