zhliu0106

Ph.D. candidate at Soochow University, China.

China

zhliu0106's Stars

openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Language: Python 17.5k 286 111.8k
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
Language: Jupyter Notebook 14k 99 181.1k
liguodongiot/llm-action
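This project shares the technical principles and hands-on experience behind large models (large-model engineering and deploying large-model applications).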
Language: HTML 12.6k 104 241.4k
datawhalechina/easy-rl
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/
Language: Jupyter Notebook 9.9k 80 1481.9k
vwxyzjn/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Language: Python 6k 37 186681
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Language: Python 4.4k 28 40226
nerfies/nerfies.github.io
Language: JavaScript 2.8k 38 5995
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
2.2k 40 3126
openai/simple-evals
Language: Python 2.1k 28 15185
zjunlp/EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
Language: Jupyter Notebook 2k 24 375244
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
1.8k 35 1556
XinJingHao/DRL-Pytorch
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
Language: Python 1.7k 10 9210
hrishioa/lumentis
AI powered one-click comprehensive docs from transcripts and text.
Language: TypeScript 1.6k 10 20100
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
Language: Python 1.1k 21 3276
ericyangyu/PPO-for-Beginners
A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Language: Python 827 12 9120
vwxyzjn/ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
Language: Python 669 3 6100
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
545 26 426
chrisliu298/awesome-llm-unlearning
A resource repository for machine unlearning in large language models
265 6 214
jlko/semantic_uncertainty
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
Language: Python 265 3 1025
isXinLiu/Awesome-MLLM-Safety
Accepted by IJCAI-24 Survey Track
Language: Python 180 8 04
GraySwanAI/nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
Language: Python 152 2 1237
ZubinGou/math-evaluation-harness
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
Language: Python 141 2 411
OpenSafetyLab/SALAD-BENCH
【ACL 2024】 SALAD benchmark & MD-Judge
Language: Python 114 4 211
NumberChiffre/mcts-llm
Language: Jupyter Notebook 83 4 31
renqibing/ActorAttack
Language: Python 61 1 21
rishub-tamirisa/tamper-resistance
Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
Language: Python 42 1 105
zjysteven/mink-plus-plus
Min-K%++: Improved baseline for detecting pre-training data of LLMs https://arxiv.org/abs/2404.02936
Language: Python 28 2 75
idanshen/Value-Augmented-Sampling
Language: Python 18 2 32
zhliu0106/probing-lm-data
Official Implementation of "Probing Language Models for Pre-training Data Detection"
Language: Python 17 1 02
zhliu0106/learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
Language: Python 8 1 44
