Pinned Repositories
HALOs
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
red-instruct
Code and datasets for the paper "Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment"
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
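DPO shows up in two of these pinned repos (HALOs and direct-preference-optimization). As a rough illustration only, not code from either repository, the per-preference-pair DPO objective, -log sigmoid(beta * margin) of policy-vs-reference log-probability margins, can be sketched in pure Python:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair (hypothetical sketch).

    margin = (log pi(chosen) - log ref(chosen))
           - (log pi(rejected) - log ref(rejected))
    loss   = -log sigmoid(beta * margin)
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp)
    # Numerically plain sigmoid; real implementations use log-sigmoid ops.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy and reference agree exactly, the margin is zero and the loss is log 2; as the policy assigns relatively more probability to the chosen response, the loss falls toward zero. The actual repos implement this batched over token-level log-probabilities in PyTorch.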
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
DDPPyTorchLightningPruningCallback
Adds support for TorchDistributedTrial in Optuna's PyTorchLightningPruningCallback
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
optuna
A hyperparameter optimization framework
trl
Train transformer language models with reinforcement learning.
YJWon99's Repositories
YJWon99/DDPPyTorchLightningPruningCallback
YJWon99/DeepSpeed
YJWon99/optuna
YJWon99/trl