thwu1

EECS PhD @ Berkeley

Pinned Repositories

bairblog.github.io
Language: JavaScript 8 15 036
Arxiv-Recommender
Language: Python 39 1 21
SPAG
Self-playing Adversarial Language Game Enhances LLM Reasoning
Language: Python 59 3 45
alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Language: Jupyter Notebook 00
llm2vec
Language: Python 00
pairwise-proximal-policy-optimization
Language: Python 30
reward_exp
Language: Python 00
spag
Self-playing Adversarial Language Game Enhances LLM Reasoning
Language: Python 00
trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Language: Python 2 0 03
visual-tool
Language: Python 10

thwu1's Repositories

thwu1/pairwise-proximal-policy-optimization
Language: Python 30
thwu1/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Language: Python 2 0 03
thwu1/visual-tool
Language: Python 10
thwu1/alpha-zero-general
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Language: Jupyter Notebook 00
thwu1/llm2vec
Language: Python 00
thwu1/reward_exp
Language: Python 00
thwu1/spag
Self-playing Adversarial Language Game Enhances LLM Reasoning
Language: Python 00