kvablack/ddpo-pytorch
DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
Python · MIT License
Issues
In train.py, the ordering of sample["advantages"] does not match that of sample["timesteps"], sample["latents"], sample["next_latents"], and sample["log_probs"].
#29 opened by YangSun22 · 2 comments
Questions about the reward curve and BERT.
#23 opened by zjuAIHz · 0 comments
Does this training process apply to the latest SD models, such as stabilityai/stable-diffusion-3-medium?
#27 opened by roywang021 · 0 comments
How to save the fine-tuned model properly
#26 opened by shashankg7 · 0 comments
training-code
#25 opened by ParnyanAtaei · 1 comment
Finetuning on Google Colab
#22 opened by alirezanobakht13 · 1 comment
Batch size unrecognized
#20 opened by mao-code · 1 comment
SDXL Support?
#17 opened by rdcoder33 · 0 comments
Code logics, thanks
#19 opened by junyongyou · 2 comments
UNet keeps producing NaN during training
#18 opened by EYcab · 8 comments
Training an aesthetic model with the default configuration on 8 A800 GPUs gets stuck after completing one epoch, but works fine on a single A800; what could be the cause?
#13 opened by cjt222 · 5 comments
About training with the prompt_image_alignment configuration, which uses the llava_bertscore reward function
#11 opened by QZJ-2003 · 0 comments
Prompt-dependent value function optimization
#15 opened by hkunzhe · 3 comments
reproducing the aesthetic experiment
#3 opened by seashell123 · 1 comment
Support for other schedulers
#14 opened by desaixie · 1 comment
Question about the optimized objective.
#12 opened by JacobYuan7 · 3 comments
Prompt Alignment with LLaVA-server: client-side prompt and image don't match the server-side reward
#6 opened by desaixie · 2 comments
fp16 only when using LoRA?
#8 opened by GiilDe · 2 comments
On reproducing LLaVA alignment experiments.
#5 opened by bhattg · 2 comments
GIF visualization
#1 opened by SnowdenLee · 1 comment
On reproducibility and LoRA
#2 opened by bhattg