hiyouga's Stars
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
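The tagline names PyTorch's two core features: tensors with optional GPU acceleration and dynamically built autograd graphs. A minimal sketch (the variable names and shapes here are illustrative, not from the repo):

```python
import torch

# Dynamic graph: ops record into the autograd graph as they execute.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.arange(6.0, device=device).reshape(2, 3)
w = torch.ones(3, 1, device=device, requires_grad=True)
loss = (x @ w).sum()   # graph is built on the fly
loss.backward()        # d(loss)/dw = column sums of x
print(w.grad.flatten().tolist())  # [3.0, 5.0, 7.0]
```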
karpathy/LLM101n
LLM101n: Let's build a Storyteller
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
mistralai/cookbook
pytorch/data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
facebookresearch/MobileLLM
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024).
multimodal-art-projection/MAP-NEO
pytorch/ao
PyTorch native quantization and sparsity for training and inference
rasbt/LLM-workshop-2024
A 4-hour coding workshop to understand how LLMs are implemented and used
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
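DoRA's core idea is to decompose a pretrained weight into a magnitude vector and a direction, then fine-tune the direction with a LoRA-style low-rank update. A minimal sketch of that decomposition in plain PyTorch (shapes, rank, and zero-initialization here are illustrative assumptions, not the official implementation):

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 8)                       # pretrained weight (out x in)
m = W.norm(dim=0, keepdim=True)             # column-wise magnitude vector
V = W                                       # direction, initialized to W
A = torch.zeros(4, 2)                       # low-rank update factors (rank 2),
B = torch.randn(2, 8)                       # A zero-initialized in this sketch
V_new = V + A @ B                           # LoRA-style directional update
W_new = m * V_new / V_new.norm(dim=0, keepdim=True)
# Before any training (A @ B == 0), the reconstruction equals W exactly.
print(torch.allclose(W_new, W, atol=1e-6))  # True
```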
HazyResearch/m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
hubertsiuzdak/snac
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
intel/auto-round
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
cognitivecomputations/grokadamw
sail-sg/sailor-llm
⚓️ Sailor: Open Language Models for South-East Asia
vwxyzjn/summarize_from_feedback_details
chujiezheng/LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
warner-benjamin/optimi
Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers
zhiyuanhubj/LongRecipe
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
VITA-Group/WeLore
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
SeunghyunSEO/optimized_hf_llama_class_for_training
aws-samples/Easy_Fintune_LLM_using_SageMaker_with_LLama_Factory
BUAADreamer/Qwen2-VL-History
A LLaMA-Factory fine-tuning case study of Qwen2-VL in the culture and tourism domain (historical literature and museums).