zhaoyl18

Ph.D. Researcher at Princeton

Princeton, NJ, U.S.

Pinned Repositories

AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
Language:Python0 0 00
bandit_sim
Language:Python0 0 00
cos598d_pruning
Assignments for COS598D: System and Machine Learning
Language:Jupyter Notebook0 0 00
ddpo-jax
Code for the paper "Training Diffusion Models with Reinforcement Learning"
Language:Python0 0 00
ddpo-pytorch
DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
Language:Python0 0 00
Deep-PCA
Language:Python1 1 01
DiGress
Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"
Language:Python0 0 00
ratio_game
Policy gradient methods for von Neumann's ratio game
Language:Python8 1 00
SEIKO
SEIKO is a novel reinforcement learning method to efficiently fine-tune diffusion models in an online setting. Our methods outperform all baselines (PPO, classifier-based guidance, direct reward backpropagation) for fine-tuning Stable Diffusion.
Language:Python17 3 30
zhaoyl18.github.io
Language:JavaScript2 1 00

zhaoyl18's Repositories

zhaoyl18/SEIKO
SEIKO is a novel reinforcement learning method to efficiently fine-tune diffusion models in an online setting. Our methods outperform all baselines (PPO, classifier-based guidance, direct reward backpropagation) for fine-tuning Stable Diffusion.
Language:Python17 3 30
zhaoyl18/ratio_game
Policy gradient methods for von Neumann's ratio game
Language:Python8 1 00
zhaoyl18/zhaoyl18.github.io
Language:JavaScript2 1 00
zhaoyl18/Deep-PCA
Language:Python1 1 01
zhaoyl18/AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
Language:Python0 0 00
zhaoyl18/bandit_sim
Language:Python0 0 00
zhaoyl18/cos598d_pruning
Assignments for COS598D: System and Machine Learning
Language:Jupyter Notebook0 0 00
zhaoyl18/ddpo-jax
Code for the paper "Training Diffusion Models with Reinforcement Learning"
Language:Python0 0 00
zhaoyl18/ddpo-pytorch
DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
Language:Python0 0 00
zhaoyl18/DiGress
Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"
Language:Python0 0 00
zhaoyl18/GDPO
Graph Diffusion Policy Optimization
Language:Python0 0 00
zhaoyl18/online_CDM
Language:Python0 1 01
zhaoyl18/gReLU
Language:Python1 0
zhaoyl18/mol_prop
Language:Jupyter Notebook1 0
zhaoyl18/MOOD
Official code repository for the paper "Exploring Chemical Space with Score-based Out-of-distribution Generation" (ICML 2023)
Language:Python0 0
zhaoyl18/RCGDM
Language:Python0 0
zhaoyl18/SVDD
Derivative-Free Guidance in Diffusion Models with Soft Value-Based Decoding. For controlled generation in DNA, RNA, proteins, molecules (+ images)
zhaoyl18/SVDD-image
Derivative-Free, Training-Free Guidance in Diffusion Models
Language:Python0 0
