joeljang/RLPHF

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

Python