anuragajay/decision-diffuser

Where is the code for history conditioning proposed in the paper?

Opened this issue · 7 comments

Decision Diffuser needs to maintain a history of length K and condition on it so that the generated plan stays consistent, as described in the paper. However, I cannot find the corresponding code that implements this. Thank you very much for answering this question!

Hi, I was just reading the code. Have you found the length-K history conditioning? I can't find it either.


I haven't found it yet. Still waiting for the authors' response.

I still have some confusion about the conditional generation of the diffusion model.

I found that the config file in this repo uses the "TemporalUnet" model as the diffusion model, so I checked how this model is used for conditional generation of states in RL. However, the `cond` parameter of the `forward` function (which I assume carries the conditioning information) is never used anywhere inside the "TemporalUnet" model. So either the author did not supply the right config file and "TemporalUnet" is not the conditional generation model, or there is something else I haven't found. In any case, I am also waiting for the author's response.
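To make the pattern concrete, here is a simplified sketch of what I am describing (not the repo's exact signature, which takes more arguments): `cond` is accepted by `forward()` but never referenced in the body.

import torch.nn as nn

class TemporalUnet(nn.Module):
    def forward(self, x, cond, time):
        # ... the body only uses x and time; cond is never read ...
        return x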


Best wishes!

Looomo commented

cond actually means s_t (the current observation), which is already put into x[:, 0, action_dim:]. So yes, the cond param is unused inside the Unet. See https://github.com/anuragajay/decision-diffuser/blob/01ce528c30b4733dc59aa6203e46ec165561158d/code/diffuser/models/diffusion.py#L260C69-L260C69 for details.
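In other words, the conditioning happens outside the network: cond is a dict mapping a timestep to a state, and the sampling loop overwrites that timestep of the noisy trajectory after each denoising step. A minimal, self-contained sketch (not the repo's exact code, shapes chosen arbitrarily):

import torch

def apply_conditioning(x, conditions, action_dim):
    # overwrite the observation slice of x at each conditioned timestep
    for t, val in conditions.items():
        x[:, t, action_dim:] = val.clone()
    return x

obs_dim, action_dim, horizon = 4, 2, 8
s_t = torch.randn(1, obs_dim)          # current observation
cond = {0: s_t}                        # condition dict: timestep -> state

x = torch.randn(1, horizon, action_dim + obs_dim)  # noisy trajectory
x = apply_conditioning(x, cond, action_dim)        # pin s_t at timestep 0
assert torch.allclose(x[:, 0, action_dim:], s_t)

Because this overwrite is repeated at every denoising step, the Unet itself never needs to read cond.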

I think the history conditioning happens here:

x_noisy = apply_conditioning(x_noisy, cond, self.action_dim)

^this will call the helper:

def apply_conditioning(x, conditions, action_dim):
    # for each conditioned timestep t, pin the observation slice of x
    for t, val in conditions.items():
        x[:, t, action_dim:] = val.clone()
    return x

and the "conditions" that are passed to it are from the dataset parametrization you choose, which can be sequence dataset (condition on first obs):

def get_conditions(self, observations):

or the other dataset classes, which pass in different conditions, such as a longer history or the goal dataset; see the sketch below.
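As a rough illustration (assuming the diffuser-style dataset API; `HistoryDataset` and `history_length` are hypothetical names, not classes in this repo), the single-observation variant and a length-K history variant would look like:

class SequenceDataset:
    def get_conditions(self, observations):
        # condition only on the current (first) observation
        return {0: observations[0]}

class HistoryDataset(SequenceDataset):
    history_length = 4  # K, chosen arbitrarily for illustration

    def get_conditions(self, observations):
        # condition on the first K observations, pinning the history
        # portion of the plan at every denoising step
        return {t: observations[t] for t in range(self.history_length)}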

I believe this is inherited from the previous work "Planning with Diffusion for Flexible Behavior Synthesis". It is used for setting the initial and final state for planning.
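For reference, the goal-conditioned variant in that prior codebase pins both endpoints of the plan, roughly like this (a sketch assuming the diffuser-style API, with an illustrative horizon):

class GoalDataset:
    horizon = 32  # plan length, illustrative value

    def get_conditions(self, observations):
        return {
            0: observations[0],                  # initial state
            self.horizon - 1: observations[-1],  # goal (final) state
        }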

I personally don't think this is the way the author suggests adding constraints.

So, has anyone solved this problem?