CFC87's Stars
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
facebookresearch/codellama
Inference code for CodeLlama models
stas00/ml-engineering
Machine Learning Engineering Open Book
huggingface/trl
Train transformer language models with reinforcement learning.
FMInference/FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
ray-project/llm-numbers
Numbers every LLM developer should know
kelvins/awesome-mlops
😎 A curated list of awesome MLOps tools
tensorchord/Awesome-LLMOps
An awesome, curated list of the best LLMOps tools for developers
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
ray-project/llm-applications
A comprehensive guide to building RAG-based LLM applications for production.
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
CStanKonrad/long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
skyplane-project/skyplane
🔥 Blazing fast bulk data transfers between any cloud 🔥
corca-ai/awesome-llm-security
A curation of awesome tools, documents and projects about LLM Security.
abacusai/Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.
DachengLi1/LongChat
Official repository for LongChat and LongEval
AmadeusChan/Awesome-LLM-System-Papers
SJTU-SE/awesome-se-notes
Notes for courses of @SJTU-SE
kyegomez/FlashAttention20
Get down and dirty with FlashAttention 2.0 in PyTorch: plug and play, no complex CUDA kernels
cosmoss-jigu/memtis
Tiered memory management
princeton-nlp/WhatICLLearns
[ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning