zhliu0106's Stars
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
liguodongiot/llm-action
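This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).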
datawhalechina/easy-rl
Chinese-language reinforcement learning tutorial (the Mushroom Book 🍄). Read online at https://datawhalechina.github.io/easy-rl/
vwxyzjn/cleanrl
High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
nerfies/nerfies.github.io
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
openai/simple-evals
zjunlp/EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
XinJingHao/DRL-Pytorch
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
hrishioa/lumentis
AI powered one-click comprehensive docs from transcripts and text.
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
ericyangyu/PPO-for-Beginners
A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8
vwxyzjn/ppo-implementation-details
The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
chrisliu298/awesome-llm-unlearning
A resource repository for machine unlearning in large language models
jlko/semantic_uncertainty
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
isXinLiu/Awesome-MLLM-Safety
Accepted by IJCAI-24 Survey Track
GraySwanAI/nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
ZubinGou/math-evaluation-harness
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
OpenSafetyLab/SALAD-BENCH
[ACL 2024] SALAD benchmark & MD-Judge
NumberChiffre/mcts-llm
renqibing/ActorAttack
rishub-tamirisa/tamper-resistance
Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
zjysteven/mink-plus-plus
Min-K%++: Improved baseline for detecting pre-training data of LLMs https://arxiv.org/abs/2404.02936
idanshen/Value-Augmented-Sampling
zhliu0106/probing-lm-data
Official Implementation of "Probing Language Models for Pre-training Data Detection"
zhliu0106/learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"