pxyWaterMoon

Pinned Repositories

Lyricify-on-Wine
This is a repository for Lyricify releases that can run on Wine.
Language: Shell 9 1 11
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Language: Python 1.3k 17 83119
HeatSupply
Language: Python 00
ics-pa-gitbook
Language: HTML 00
motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
Language: Python 00
pxyWaterMoon.github.io
Language: HTML 00
RL4LMs
A modular RL library to fine-tune language models to human preferences
Language: Python 00
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Language: Python 00
safety-tuned-llamas
ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety.
Language: Python 59 2 76

pxyWaterMoon's Repositories

pxyWaterMoon/HeatSupply
Language: Python 00
pxyWaterMoon/ics-pa-gitbook
Language: HTML 00
pxyWaterMoon/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
Language: Python 00
pxyWaterMoon/pxyWaterMoon.github.io
Language: HTML 00
pxyWaterMoon/RL4LMs
A modular RL library to fine-tune language models to human preferences
Language: Python 00
pxyWaterMoon/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Language: Python 00