Pinned Repositories
rlfh-gen-div
Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity".
trl
Train transformer language models with reinforcement learning.
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Awesome-Mixture-of-Experts-Papers
A curated reading list of research in Mixture-of-Experts (MoE).
kNeuron-Tuning
Reward-Calibration
TRAMA
Source code for the TALLIP paper "Token Relation Aware Chinese Named Entity Recognition".
Transformer-Patcher
ZeroYuHuang.github.io
RMB-Reward-Model-Benchmark
ZeroYuHuang's Repositories
ZeroYuHuang/Transformer-Patcher
ZeroYuHuang/Reward-Calibration
ZeroYuHuang/Awesome-Mixture-of-Experts-Papers
A curated reading list of research in Mixture-of-Experts (MoE).
ZeroYuHuang/kNeuron-Tuning
ZeroYuHuang/TRAMA
Source code for the TALLIP paper "Token Relation Aware Chinese Named Entity Recognition".
ZeroYuHuang/ZeroYuHuang.github.io