WhisperT's Stars
deepseek-ai/DeepSeek-V3
deepseek-ai/DeepSeek-R1
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
huggingface/open-r1
Fully open reproduction of DeepSeek-R1
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
naklecha/llama3-from-scratch
llama3 implementation, one matrix multiplication at a time
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
Jiayi-Pan/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
InternLM/InternLM
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
xiaoyaDev/xiaoya-alist
Companion resources and tooling for Xiaoya Alist
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
wainshine/Chinese-Names-Corpus
A corpus of Chinese personal names and a name generator: Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.
modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
hkust-nlp/simpleRL-reason
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data.
CLUEbenchmark/SuperCLUE
SuperCLUE: A Comprehensive Benchmark for Chinese General-Purpose Foundation Models
XinJingHao/DRL-Pytorch
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
PRIME-RL/PRIME
Scalable RL solution for advanced reasoning of language models
RLHFlow/RLHF-Reward-Modeling
Recipes for training reward models for RLHF.
SarvagyaVaish/FlappyBirdRL
Flappy Bird hack using Reinforcement Learning
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
IEIT-Yuan/Yuan-2.0
Yuan 2.0 Large Language Model
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
OpenBMB/Eurus
RLHF-V/RLAIF-V
[CVPR'25] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
apachecn/stanford-cs234-notes-zh
Chinese lecture notes for Stanford CS234: Reinforcement Learning
yuyq96/TextHawk
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
RLHFlow/Directional-Preference-Alignment
Directional Preference Alignment
Tlntin/qwen-ascend-llm