wl-zhao/VPD

Depth Estimation with VPD

cvrookieytd opened this issue · 6 comments

Has anyone succeeded in replicating it? Feel free to leave a message here so we can discuss.

I can't replicate the depth estimation result; maybe my batch size is too small (only 2). I get RMS = 0.264 (the VPD paper reports 0.254).
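For reference, RMS here is the standard root-mean-squared depth error. A minimal sketch of how it is typically computed; the function name and the NYUv2-style valid-depth thresholds are my assumptions, not necessarily VPD's exact eval code:

```python
import torch

def rmse(pred: torch.Tensor, gt: torch.Tensor,
         min_depth: float = 1e-3, max_depth: float = 10.0) -> torch.Tensor:
    """Root-mean-squared error over valid ground-truth depth pixels."""
    valid = (gt > min_depth) & (gt < max_depth)  # mask out missing/invalid depth
    diff = pred[valid] - gt[valid]
    return torch.sqrt((diff ** 2).mean())
```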

Hi,
In our experiments, the global batch size is 24. Maybe you can try a larger batch size.
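If GPU memory is what forces the batch size down to 2, gradient accumulation is one common way to approximate the larger global batch. A minimal PyTorch sketch with dummy stand-ins (the model, loss, optimizer, and loader below are placeholders, not VPD's actual training loop):

```python
import torch
from torch import nn

# Dummy stand-ins -- swap in VPD's model, loss, and NYUv2 dataloader.
model = nn.Linear(8, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loader = [(torch.randn(2, 8), torch.randn(2, 1)) for _ in range(24)]

accum_steps = 12  # per-step batch of 2 x 12 accumulated steps = effective 24

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = loss_fn(model(x), y)
    (loss / accum_steps).backward()  # scale so accumulated grads average out
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

Note this matches the gradients of a true 24-sample batch only up to batch-dependent layers such as BatchNorm, which still see batches of 2.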

Is it normal to get a NaN loss for the first few hundred batches of epoch 1?
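A minimal sketch of the kind of NaN guard that can help localize this; everything here is a dummy stand-in for the real model, optimizer, and loader, not VPD's training code:

```python
import torch
from torch import nn

model = nn.Linear(8, 1)                            # stand-ins; use the real
optimizer = torch.optim.AdamW(model.parameters())  # model/optimizer/dataloader
loss_fn = nn.MSELoss()
loader = [(torch.randn(2, 8), torch.randn(2, 1)) for _ in range(10)]

torch.autograd.set_detect_anomaly(True)  # slow, but reports the op producing NaN/Inf

for step, (x, y) in enumerate(loader):
    loss = loss_fn(model(x), y)
    if not torch.isfinite(loss):
        print(f"non-finite loss at step {step}; skipping batch")
        optimizer.zero_grad()
        continue  # skip instead of stepping on corrupted gradients
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # tame gradient spikes
    optimizer.step()
    optimizer.zero_grad()
```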

Also, I downloaded the missing v1-5-pruned-emaonly.ckpt from https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt. Loading it gave the following warning, which I chose to ignore:
Restored from ../checkpoints/v1-5-pruned-emaonly.ckpt with 446 missing and 199 unexpected keys

Could this be the reason for the NaN loss values?
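One way to judge whether that warning matters is to inspect exactly which keys are missing or unexpected; if they all belong to parts the model replaces or drops, the mismatch is usually benign. A minimal sketch, assuming the usual Stable Diffusion checkpoint layout where weights sit under a "state_dict" entry (the stand-in model is a placeholder for whatever you instantiated from the config):

```python
import torch
from torch import nn

model = nn.Linear(4, 4)  # stand-in; use the model built from the VPD config

ckpt = torch.load("../checkpoints/v1-5-pruned-emaonly.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # SD checkpoints nest weights here

missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(len(missing), "missing keys, e.g.", missing[:3])
print(len(unexpected), "unexpected keys, e.g.", unexpected[:3])
```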


In inference.yaml, the conditioning stage is configured as:

cond_stage_config:
  target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

But in ldm/modules/encoders/modules.py, how did you handle the FrozenCLIPEmbedder class?
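For context, in the stock latent-diffusion code FrozenCLIPEmbedder is a thin frozen wrapper around Hugging Face's CLIP text encoder. A sketch along those lines, assuming VPD inherits it unchanged (the cpu default here is mine; the original defaults to cuda):

```python
import torch.nn as nn
from transformers import CLIPTokenizer, CLIPTextModel

class FrozenCLIPEmbedder(nn.Module):
    """Frozen Hugging Face CLIP text encoder, as in stock latent-diffusion."""
    def __init__(self, version="openai/clip-vit-large-patch14",
                 device="cpu", max_length=77):
        super().__init__()
        self.tokenizer = CLIPTokenizer.from_pretrained(version)
        self.transformer = CLIPTextModel.from_pretrained(version)
        self.device = device
        self.max_length = max_length
        for p in self.parameters():  # frozen: the conditioning stage is not trained
            p.requires_grad = False

    def forward(self, text):
        tokens = self.tokenizer(text, truncation=True, max_length=self.max_length,
                                padding="max_length", return_tensors="pt")
        out = self.transformer(input_ids=tokens["input_ids"].to(self.device))
        return out.last_hidden_state  # (batch, 77, 768) conditioning sequence

    def encode(self, text):
        return self(text)
```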

This issue is closed since there has been no discussion for weeks. If there are any questions, please open a new issue.