ChangyuChen347's Stars
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
OpenRLHF/OpenRLHF
An easy-to-use, scalable, and high-performance RLHF framework (70B+ PPO full tuning & iterative DPO & LoRA & RingAttention)
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
hendrycks/test
Measuring Massive Multitask Language Understanding | ICLR 2021
princeton-nlp/SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
hsiehjackson/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
lmarena/arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
corl-team/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
LiveBench/LiveBench
LiveBench: A Challenging, Contamination-Free LLM Benchmark
p-lambda/dsir
DSIR large-scale data selection framework for language model training
Psycoy/MixEval
The official evaluation suite and dynamic data release for MixEval.
TIGER-AI-Lab/MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Vance0124/Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization (TDPO)
snu-mllab/EDAC
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)
chujiezheng/LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
shenao-zhang/SELM
The official implementation of Self-Exploring Language Models (SELM)
hamishivi/EasyLM
Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
haozheji/exact-optimization
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment
RUCAIBox/JiuZhang3.0
The code and data for the paper "JiuZhang3.0"
thu-ml/Noise-Contrastive-Alignment
Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)
VinAIResearch/RecGPT
RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024)
wzhouad/WPO
Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"
thanhnguyentang/mmdrl
Official repo for our AAAI'21 paper, https://arxiv.org/abs/2007.12354
ZhaolinGao/REBEL
ars22/scaling-LLM-math-synthetic-data
Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"
TsinghuaC3I/Intuitive-Fine-Tuning
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
alecwangcq/f-divergence-dpo
Direct preference optimization with f-divergences.
morganf33/GNR
Code for "Generative News Recommendation"