Question about DDIM loss

Question

Opened this issue a year ago · 2 comments

Hi, thanks for your wonderful work!
While reading the code, I noticed that you apply the time embedding to the feature extracted from the RGB images. I am wondering whether it would be better to apply the time embedding to the depth output by the decoder (namely 'refined_depth' as defined in your code), or simply to the annotated depth with masks.
Thanks again for your work and code!

Answer 1 · 2023-08-03T00:51:25.000Z

Hi, thanks for your question.
At that time, our thinking was to add the time embedding to a dense and consistent feature, which is closer to the original diffusion model. I haven't tried putting the time embedding directly on the depth map.
Have you made any attempts at that? If the results are positive, I'd be keen to work on an improved version with you.
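
For readers following along, here is a minimal sketch of what "adding the time embedding to a feature" can look like. It is not the repository's code: the function names, the nested-list layout `[C][N]` (C channels, N pixels) and the sinusoidal form of the embedding are all assumptions made for illustration.

```python
import math

def sinusoidal_time_embedding(t, dim):
    """Sinusoidal embedding of a scalar timestep t (dim is assumed to be even)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    # First half holds sines, second half holds cosines.
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

def add_time_embedding(feature, t):
    """Add one embedding value per channel to a feature laid out as [C][N]."""
    emb = sinusoidal_time_embedding(t, len(feature))
    return [[v + emb[c] for v in row] for c, row in enumerate(feature)]

# Hypothetical 4-channel feature with 3 pixels per channel.
feature = [[0.0, 0.0, 0.0] for _ in range(4)]
conditioned = add_time_embedding(feature, t=10)
```

In this sketch each channel receives its own offset, so the number of channels determines how much of the embedding the feature can carry. Applying the same idea to a depth map would need some projection first, which is a design choice this sketch does not cover.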

Answer 2 · 2023-09-22T01:48:12.000Z

Hi! Thanks for the nice work.
I noticed that you chose to predict x0 rather than predicting the noise as DDPM does.
Can you share the reason with me?
Thanks again~
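
For context on the x0 versus noise question: under the standard forward process x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, either quantity can be recovered from the other once x_t is known, so the two prediction targets differ in parameterization rather than in information. The sketch below only illustrates that identity. The helper names and the flat-list representation are hypothetical and are not taken from the repository.

```python
import math

def q_sample(x0, eps, alpha_bar):
    """Forward process: mix the clean signal x0 with noise eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]

def eps_from_x0(x_t, x0, alpha_bar):
    """Recover the noise implied by a predicted x0."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - a * x) / b for xt, x in zip(x_t, x0)]

def x0_from_eps(x_t, eps, alpha_bar):
    """Recover the clean signal implied by a predicted noise."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps)]

# Hypothetical values, chosen only to exercise the round trip.
x0 = [0.5, -1.0, 2.0]
eps = [0.1, 0.3, -0.7]
alpha_bar = 0.6
x_t = q_sample(x0, eps, alpha_bar)
```

Calling `eps_from_x0(x_t, x0, alpha_bar)` returns the original `eps` up to floating-point error, and `x0_from_eps(x_t, eps, alpha_bar)` returns `x0`. This does not answer why the authors chose x0 here.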