Somoku's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
meta-llama/llama3
The official Meta Llama 3 GitHub site
richards199999/Thinking-Claude
Let your Claude think
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
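A hedged usage sketch following NeMo's documented `from_pretrained`/`transcribe` pattern; the model name and audio path are illustrative placeholders:

```python
# Sketch: transcribe audio with a pretrained NeMo ASR model.
# Model name and file path are placeholders to verify against the docs.
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_conformer_ctc_small"
)
print(asr_model.transcribe(["sample.wav"])[0])
```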
federico-busato/Modern-CPP-Programming
Modern C++ Programming Course (C++03/11/14/17/20/23/26)
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
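The typical integration is a one-call monkey patch applied before model construction; this sketch follows the project's README pattern, and the helper name and model ID should be treated as assumptions:

```python
# Sketch: swap a Hugging Face Llama model's ops (RMSNorm, RoPE, SwiGLU,
# fused cross-entropy) for Liger's Triton kernels.
from liger_kernel.transformers import apply_liger_kernel_to_llama
from transformers import AutoModelForCausalLM

apply_liger_kernel_to_llama()  # patch before the model is instantiated
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
```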
sksq96/pytorch-summary
Model summary in PyTorch similar to `model.summary()` in Keras
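Typical usage, assuming the package is importable as `torchsummary`:

```python
# Sketch: Keras-style per-layer shape and parameter summary.
import torch.nn as nn
from torchsummary import summary

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 28 * 28, 10),
)
summary(model, input_size=(1, 28, 28), device="cpu")
```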
gpu-mode/lectures
Material for gpu-mode lectures
dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
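LongLoRA's shifted sparse attention (S2-Attn) attends within fixed-size groups while shifting half the heads by half a group, so information still crosses group boundaries. A minimal illustrative PyTorch sketch, not the repo's implementation:

```python
import torch

def s2_attention(q, k, v, group_size):
    # Illustrative S2-Attn: q, k, v are (batch, heads, seqlen, dim),
    # with seqlen divisible by group_size.
    B, H, N, D = q.shape
    half, shift = H // 2, group_size // 2
    q, k, v = q.clone(), k.clone(), v.clone()
    for t in (q, k, v):  # offset group boundaries for half the heads
        t[:, half:] = t[:, half:].roll(-shift, dims=2)

    def grp(x):
        return x.reshape(B, H, N // group_size, group_size, D)

    attn = torch.softmax(grp(q) @ grp(k).transpose(-1, -2) / D**0.5, dim=-1)
    out = (attn @ grp(v)).reshape(B, H, N, D)
    out[:, half:] = out[:, half:].roll(shift, dims=2)  # undo the shift
    return out
```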
adapter-hub/adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
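A hedged sketch of the bottleneck-adapter workflow; `AutoAdapterModel`, `add_adapter`, `train_adapter`, and the `seq_bn` config string follow the library's documented API, but treat the details as assumptions:

```python
# Sketch: train only a small bottleneck adapter on a frozen encoder.
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_task", config="seq_bn")  # sequential bottleneck
model.train_adapter("my_task")   # freeze base weights, unfreeze adapter
model.set_active_adapters("my_task")
```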
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, delivering better performance with lower memory utilization in both training and inference.
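The usual pattern is to build layers from `transformer_engine.pytorch` and run forward passes inside an FP8 autocast region; the recipe parameters below are illustrative:

```python
# Sketch: run a Transformer Engine linear layer with FP8 GEMMs.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

layer = te.Linear(768, 3072).cuda()
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

x = torch.randn(16, 768, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # GEMM executes in FP8 on Hopper/Ada-class GPUs
```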
HazyResearch/ThunderKittens
Tile primitives for speedy kernels
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
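The idea behind ReFT (in its LoReFT variant) is to leave the model frozen and learn a low-rank edit of hidden representations, h <- h + R^T(Wh + b - Rh). A standalone PyTorch sketch of such an intervention; all names are hypothetical, not the pyreft API:

```python
import torch
import torch.nn as nn

class LowRankIntervention(nn.Module):
    # Illustrative LoReFT-style edit: h <- h + R^T (W h + b - R h).
    def __init__(self, hidden_size, rank):
        super().__init__()
        self.R = nn.Parameter(torch.empty(rank, hidden_size))
        nn.init.orthogonal_(self.R)
        self.W = nn.Linear(hidden_size, rank)

    def forward(self, h):  # h: (batch, seq, hidden)
        delta = self.W(h) - h @ self.R.T  # edit within a rank-r subspace
        return h + delta @ self.R
```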
AmberLJC/LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
linux-rdma/perftest
InfiniBand Verbs Performance Tests
AmadeusChan/Awesome-LLM-System-Papers
hahnyuan/LLM-Viewer
Analyze the inference of Large Language Models (LLMs): computation, storage, transmission, and the hardware roofline model, in a user-friendly interface.
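The roofline model it refers to caps attainable throughput at min(peak compute, arithmetic intensity x memory bandwidth). A tiny worked sketch with illustrative, roughly A100-class numbers:

```python
# Sketch: roofline bound given a kernel's FLOPs and bytes moved.
def roofline_tflops(flops, bytes_moved, peak_tflops=312.0, bw_tb_s=2.0):
    intensity = flops / bytes_moved        # FLOPs per byte
    return min(peak_tflops, intensity * bw_tb_s)

# Batch-1 decode GEMV reads each weight byte for ~2 FLOPs: memory-bound.
print(roofline_tflops(flops=2e9, bytes_moved=1e9))   # 4.0   (bandwidth-bound)
print(roofline_tflops(flops=2e12, bytes_moved=1e9))  # 312.0 (compute-bound)
```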
S-Lab-System-Group/Awesome-DL-Scheduling-Papers
bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
THUDM/LongAlign
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
imoneoi/multipack_sampler
Multipack distributed sampler for fast padding-free training of LLMs
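The underlying idea is bin packing: sort sequences by length and pack them into fixed-capacity bins so batches carry almost no padding. A self-contained first-fit-decreasing sketch, not the repo's sampler:

```python
def pack_sequences(lengths, max_len):
    """Greedy first-fit-decreasing packing of sequence lengths into
    bins of capacity max_len; returns lists of sequence indices."""
    bins = []  # each bin: [remaining_capacity, [indices]]
    for i in sorted(range(len(lengths)), key=lambda i: -lengths[i]):
        for b in bins:
            if lengths[i] <= b[0]:
                b[0] -= lengths[i]
                b[1].append(i)
                break
        else:
            bins.append([max_len - lengths[i], [i]])
    return [b[1] for b in bins]

print(pack_sequences([900, 600, 400, 300, 100], max_len=1024))
# -> [[0, 4], [1, 2], [3]]  (each pack fits within 1024 tokens)
```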
Outsider565/LoRA-GA
LoRA-GA: Low-Rank Adaptation with Gradient Approximation (NeurIPS 2024)
git-cloner/llama-lora-fine-tuning
LLaMA fine-tuning with LoRA
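For reference, the common way to set this up with Hugging Face's peft library (not this repo's scripts; the model ID and hyperparameters are illustrative):

```python
# Sketch: inject trainable low-rank adapters into a frozen causal LM.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of base params
```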
Mellanox/gpu_direct_rdma_access
Example code that uses DC QPs to provide RDMA READ and WRITE operations to remote GPU memory
OpenNLPLab/LASP
Linear Attention Sequence Parallelism (LASP)
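The base operation LASP parallelizes is linear attention, which replaces softmax(QK^T)V with phi(Q)(phi(K)^T V), so cost is linear in sequence length and the (K^T V) state can be chunked along the sequence. A minimal non-causal, non-distributed sketch:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seqlen, dim); phi(x) = elu(x) + 1.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.einsum("bhnd,bhne->bhde", k, v)  # d x d state, O(N d^2)
    z = 1 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
```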
PKU-DAIR/Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).
LLMServe/dLoRA-artifact
Artifact of dLoRA (OSDI 2024), a serving system that dynamically orchestrates requests and LoRA adapters