yang1fan2's Stars
labmlai/annotated_deep_learning_paper_implementations
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
microsoft/autogen
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
karpathy/llm.c
LLM training in simple, raw C/CUDA
VikParuchuri/marker
Convert PDF to markdown + JSON quickly with high accuracy
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
ShishirPatil/gorilla
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
lucidrains/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning.
google-deepmind/alphageometry
suragnair/alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
microsoft/LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM related technologies as possible. 我们打造了方便研究人员上手和使用大模型等微调平台,我们欢迎开源爱好者发起任何有意义的pr!
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
google-deepmind/code_contests
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
nikhilbarhate99/PPO-PyTorch
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
google-research/FLAN
srush/awesome-o1
A bibliography and survey of the papers surrounding o1
volcengine/veScale
A PyTorch Native LLM Training Framework
volcengine/verl
veRL: Volcano Engine Reinforcement Learning for LLM
bigcode-project/bigcode-dataset
SpursGoZmy/Awesome-Tabular-LLMs
We collect papers about "large language models (LLM) for table-related tasks", e.g., using LLM for Table QA task. “表格+LLM”相关论文整理
MARIO-Math-Reasoning/Super_MARIO
FreedomIntelligence/ReasoningNLP
paper list on reasoning in NLP