No dynamic scenes were obtained after stage 3 training

Question

ByChelsea opened this issue 9 months ago · 6 comments

Hello, I used the prompt "a dog riding a skateboard" provided in train.sh, but the result doesn't seem to be 4D; there's hardly any dynamic change. What could be the reason for this?

it70037-test.mp4

Answer 1 · 2024-02-10T12:55:48.000Z

Hi, is this with the low-VRAM config? This can happen with that setup. Have you tried increasing system.loss.lambda_sds_video as described in the README?

Answer 2 · 2024-02-10T13:00:22.000Z

Yes, it is the low vram config. I didn't change the value of system.loss.lambda_sds_video; I kept it at 0.1. I'll try increasing it to see if it yields better results. Thanks!

Answer 3 · 2024-02-10T13:23:50.000Z

I also found the original implementation to give a bit better results: https://github.com/sherwinbahmani/4dfy

Increasing system.loss.lambda_sds_video towards 1.0 will increase motion significantly.
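
For anyone unsure what changing that key amounts to, here is a minimal sketch of a dotted override applied to a nested config. It assumes the setting lives at system -> loss -> lambda_sds_video, as the key name suggests. The helper apply_override is hypothetical and written only for illustration; it is not part of the repo, and the repo's actual config loader may behave differently.

```python
from copy import deepcopy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one 'a.b.c=value' override applied.

    Hypothetical helper for illustration; values are parsed as floats.
    """
    dotted_key, raw_value = override.split("=", 1)
    keys = dotted_key.split(".")
    result = deepcopy(config)  # leave the caller's config untouched
    node = result
    for key in keys[:-1]:
        # Walk down the nesting, creating levels that do not exist yet.
        node = node.setdefault(key, {})
    node[keys[-1]] = float(raw_value)
    return result


# 0.1 is the value reported in this thread for the low-VRAM run.
base = {"system": {"loss": {"lambda_sds_video": 0.1}}}
updated = apply_override(base, "system.loss.lambda_sds_video=1.0")
print(updated["system"]["loss"]["lambda_sds_video"])
```

Stepping through a few values between 0.1 and 1.0, rather than jumping straight to 1.0, makes it easier to see where motion appears and where quality starts to drop.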

Answer 4 · 2024-02-10T13:33:26.000Z

Thanks for the suggestion! I checked and realized that I'm using the original implementation. I'm sorry for the confusion between the two repos when submitting the issues...I'll be careful about this in the future. :)

BTW, it seems that low VRAM (24G) really does significantly reduce the quality...

Answer 5 · 2024-02-10T13:37:48.000Z

A better 4D representation, such as a deformation-based one, would also help improve the motion. The current one, with the 4D hash grids, was developed more for high quality, and motion is generally traded away by reducing system.loss.lambda_sds_video. So increasing system.loss.lambda_sds_video increases motion but also compromises quality more. In the normal VRAM setting it seemed to be fine like this; for low VRAM it might need more hyperparameter tuning.
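
A rough way to picture the trade-off, assuming the training objective is a weighted sum of a video (motion) term and the remaining image-quality terms. The term names "video" and "image" and the raw magnitudes below are placeholders chosen for illustration, not values measured from a real run.

```python
def weighted_shares(weights: dict, raw_losses: dict) -> dict:
    """Return each term's fraction of the total weighted loss."""
    weighted = {name: weights[name] * raw_losses[name] for name in raw_losses}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


# Placeholder magnitudes: both raw terms are set equal so that only the
# weights change the balance.
raw = {"video": 1.0, "image": 1.0}

low = weighted_shares({"video": 0.1, "image": 1.0}, raw)   # lambda_sds_video = 0.1
high = weighted_shares({"video": 1.0, "image": 1.0}, raw)  # lambda_sds_video = 1.0

print(f"video share at 0.1: {low['video']:.3f}")
print(f"video share at 1.0: {high['video']:.3f}")
```

Under these assumptions, raising the video weight gives the motion term a larger share of the objective, and the image-quality terms a correspondingly smaller one, which matches the behaviour described above.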

Answer 6 · 2024-02-10T13:46:55.000Z

Thank you very much for the share and suggestions, they are very helpful!