akaitsuki-ii

@Microsoft @ByteDance
Guangzhou, Suzhou

akaitsuki-ii's Stars

conda-forge/miniforge
A conda-forge distribution.
Language: Shell, 6.2k stars, 323 forks
karpathy/llm.c
LLM training in simple, raw C/CUDA
Language: Cuda, 23.6k stars, 2.6k forks
karpathy/llama2.c
Inference Llama 2 in one file of pure C
Language: C, 17.2k stars, 2.1k forks
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++, 8.3k stars, 932 forks
liguodongiot/llm-action
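This project aims to share the technical principles behind large models and hands-on experience with them.
Language: HTML, 9.4k stars, 918 forks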
microsoft/LLMLingua
To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
Language: Python, 4.5k stars, 251 forks
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
Language: Cuda, 1.5k stars, 122 forks
microsoft/chunk-attention
Language: Python, 356
triton-lang/triton
Development repository for the Triton language and compiler
Language: C++, 12.9k stars, 1.6k forks
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Language: Python, 2.4k stars, 194 forks
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Language: Python, 1.9k stars, 342 forks
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Language: Python, 10.2k stars, 2.3k forks
huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
Language: Python, 7.8k stars, 941 forks
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language: Python, 16k stars, 1.6k forks
LargeWorldModel/LWM
Language: Python, 7.1k stars, 549 forks
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
Language: Python, 4.5k stars, 446 forks
brexhq/prompt-engineering
Tips and tricks for working with Large Language Models like OpenAI's GPT-4.
8.4k stars, 386 forks
AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language: Python, 4.4k stars, 467 forks
dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
Language: Python, 2.3k stars, 225 forks
federico-busato/Modern-CPP-Programming
Modern C++ Programming Course (C++03/11/14/17/20/23/26)
Language: HTML, 11.9k stars, 797 forks
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python, 13.6k stars, 1.2k forks
lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
Language: Jupyter Notebook, 4.5k stars, 359 forks
ByteByteGoHq/system-design-101
Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
63.4k stars, 6.6k forks
ml-explore/mlx
MLX: An array framework for Apple silicon
Language: C++, 16.6k stars, 953 forks
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
Language: Python, 13.4k stars, 1.6k forks
noamgat/lm-format-enforcer
Enforce the output format (JSON Schema, Regex etc) of a language model
Language: Python, 1.4k stars, 65 forks
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Language: Python, 167k stars, 44.2k forks
cpacker/MemGPT
Letta (fka MemGPT) is a framework for creating stateful LLM services.
Language: Python, 11.9k stars, 1.3k forks
microsoft/autogen
A programming framework for agentic AI 🤖
Language: Jupyter Notebook, 31.5k stars, 4.6k forks
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
Language: Jupyter Notebook, 93k stars, 14.9k forks
