hiyouga's Stars
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
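The tagline names PyTorch's two core features: tensors with optional GPU acceleration and dynamically built autograd graphs. A minimal sketch (the variable names and shapes here are illustrative, not from the repo):

```python
import torch

# Dynamic graph: ops record into the autograd graph as they execute.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.arange(6.0, device=device).reshape(2, 3)
w = torch.ones(3, 1, device=device, requires_grad=True)
loss = (x @ w).sum()   # graph is built on the fly
loss.backward()        # d(loss)/dw = column sums of x
print(w.grad.flatten().tolist())  # [3.0, 5.0, 7.0]
```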
karpathy/LLM101n
LLM101n: Let's build a Storyteller
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
mistralai/cookbook
pytorch/data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
facebookresearch/MobileLLM
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024).
multimodal-art-projection/MAP-NEO
pytorch/ao
PyTorch native quantization and sparsity for training and inference
rasbt/LLM-workshop-2024
A 4-hour coding workshop to understand how LLMs are implemented and used
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
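DoRA's core idea is to decompose a pretrained weight into a magnitude vector and a direction, then fine-tune the direction with a LoRA-style low-rank update. A minimal sketch of that decomposition in plain PyTorch (shapes, rank, and zero-initialization here are illustrative assumptions, not the official implementation):

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 8)                       # pretrained weight (out x in)
m = W.norm(dim=0, keepdim=True)             # column-wise magnitude vector
V = W                                       # direction, initialized to W
A = torch.zeros(4, 2)                       # low-rank update factors (rank 2),
B = torch.randn(2, 8)                       # A zero-initialized in this sketch
V_new = V + A @ B                           # LoRA-style directional update
W_new = m * V_new / V_new.norm(dim=0, keepdim=True)
# Before any training (A @ B == 0), the reconstruction equals W exactly.
print(torch.allclose(W_new, W, atol=1e-6))  # True
```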
HazyResearch/m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
hubertsiuzdak/snac
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
intel/auto-round
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
cognitivecomputations/grokadamw
sail-sg/sailor-llm
⚓️ Sailor: Open Language Models for South-East Asia
vwxyzjn/summarize_from_feedback_details
chujiezheng/LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
warner-benjamin/optimi
Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers
zhiyuanhubj/LongRecipe
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
VITA-Group/WeLore
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
SeunghyunSEO/optimized_hf_llama_class_for_training
aws-samples/Easy_Fintune_LLM_using_SageMaker_with_LLama_Factory
BUAADreamer/Qwen2-VL-History
A LLaMA-Factory fine-tuning case study of Qwen2-VL in the culture and tourism domain (historical literature and museums).