Dzz2004's Stars
Dzz2004/RL_algorithms
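Lab reports for the BIT reinforcement learning course, collecting common reinforcement learning algorithms and their implementations in the CartPole environment.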
tianyi-lab/Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
hkust-nlp/deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
IronBeliever/CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation