Pinned Repositories
Lyricify-on-Wine
This is a repository for Lyricify releases that can run on Wine.
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
HeatSupply
ics-pa-gitbook
motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
pxyWaterMoon.github.io
RL4LMs
A modular RL library to fine-tune language models to human preferences
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
safety-tuned-llamas
ICLR 2024 paper showing properties of safety tuning and exaggerated safety.
pxyWaterMoon's Repositories
pxyWaterMoon/HeatSupply
pxyWaterMoon/ics-pa-gitbook
pxyWaterMoon/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
pxyWaterMoon/pxyWaterMoon.github.io
pxyWaterMoon/RL4LMs
A modular RL library to fine-tune language models to human preferences
pxyWaterMoon/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback