Where is Ltxt? How is the Mixed Latent Strategy involved in training?

Question

Where is Ltxt? How is the Mixed Latent Strategy involved in training?

lrzjason opened this issue 4 months ago · 3 comments

lrzjason commented 4 months ago

In Figure 4 (Overview of CoMat), there is a loss called Ltxt. It involves a GT prompt and a Text prompt.

In the description at the bottom, it becomes Li2t, which involves only the Text prompt.

The formula just before Section 5, which combines all the losses, also does not include Ltxt.

I double-checked the code: it generates only one image and does not use the 'Noisy GT'.

Also, self.caption_model() is called only once.

Was the Mixed Latent Strategy used in training at all?

Answer 1 · 2024-08-09T02:41:57.000Z

Hi @lrzjason, the current version does not include the mixed latent strategy. We will update the codebase soon. Please stay tuned!

Answer 2 · 2024-08-09T02:46:42.000Z

Thanks for pointing out the error in Fig. 4. In fact, the $\mathcal{L}_{txt}$ in the figure should be $\mathcal{L}_{i2t}$. $\mathcal{L}_{i2t}$ involves both the GT prompt and the Text prompt.

Answer 3 · 2024-08-09T08:33:56.000Z

@CaraJ7 Thanks for the reply.