Where is Ltxt? How is the Mixed Latent Strategy involved in training?
lrzjason opened this issue · 3 comments
Figure 4 (Overview of CoMat) shows a loss called Ltxt, which involves both a GT prompt and a Text prompt.
In the caption below the figure, it becomes Li2t, which involves only the Text prompt.
The formula just before Section 5, which combines all the losses, also doesn't include Ltxt.
I double-checked the code: it generates only one image and doesn't use the 'Noisy GT'.
And self.caption_model() is called only once.
Was the Mixed Latent Strategy used in training at all?
Hi @lrzjason, the current version does not include the mixed latent strategy. We will update the codebase soon. Please stay tuned!
Thanks for pointing out the error in Fig. 4. In fact, the