
Initializing action heads in finetune_config file


Thanks for the great model! I am trying to finetune Octo on a small Franka Panda dataset. I've had some success modifying the Aloha example with my own data, but I noticed that its hyperparameters differ from the finetune_config provided for the advanced finetuning approach (which seems to use the params from the paper). Since my action space is different, I'd like to reinitialize the action head, but I don't see such an option in the config file. Could someone suggest a way to include this in the config? Thanks!

If you change the action dim in the action head, the corresponding new params will automatically be initialized from scratch (e.g. the output layer of the action head).
You can check the function that copies params from the pre-trained checkpoint into the finetuning init here: https://github.com/octo-models/octo/blob/main/octo/utils/train_utils.py#L382
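For reference, here is a minimal sketch of overriding the action head in the finetune config so its output layer is sized for a new action space and therefore freshly initialized. It follows the pattern of the Aloha finetuning example; the head class and argument names (L1ActionHead, action_dim, pred_horizon, readout_key) are assumptions that may differ across Octo versions:

```python
from octo.model.components.action_heads import L1ActionHead
from octo.utils.spec import ModuleSpec

# Replace the pre-trained action head with one sized for the new action space.
# Params whose shapes change (e.g. the output projection) are initialized from
# scratch when the pre-trained checkpoint is merged into the finetuning init.
config["model"]["heads"]["action"] = ModuleSpec.create(
    L1ActionHead,
    action_dim=7,             # e.g. 6-DoF end-effector delta + gripper for a Franka Panda
    pred_horizon=4,           # length of the predicted action chunk (illustrative value)
    readout_key="readout_action",
)
```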

If you want to re-init the whole action head, not just the params that change shape, you can either change the name of the action head or hack the function above to include a "skip_keys" argument that manually excludes keys that match a certain pattern from the keys_to_update, similar to how we implement frozen_keys in the optimizer:

def freeze_weights(...) in octo/utils/train_utils.py
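A hedged sketch of what such a skip_keys filter could look like, assuming the merge operates on flattened param dicts. The function name merge_params_with_skip and its structure are illustrative, not the actual implementation in train_utils.py:

```python
import re
from flax.traverse_util import flatten_dict, unflatten_dict


def merge_params_with_skip(target_params, pretrained_params, skip_keys=()):
    """Copy pre-trained params into target_params, skipping keys that match skip_keys.

    skip_keys are regex patterns matched against the flattened parameter path,
    e.g. (".*heads_action.*",). Params whose shapes differ (such as a resized
    action-head output layer) are also left at their fresh initialization.
    """
    flat_target = flatten_dict(target_params, sep="/")
    flat_pretrained = flatten_dict(pretrained_params, sep="/")
    for key, value in flat_pretrained.items():
        if key not in flat_target:
            continue  # param does not exist in the new model
        if any(re.fullmatch(pattern, key) for pattern in skip_keys):
            continue  # explicitly excluded -> keep fresh initialization
        if flat_target[key].shape != value.shape:
            continue  # shape changed (e.g. new action dim) -> keep fresh initialization
        flat_target[key] = value
    return unflatten_dict(flat_target, sep="/")
```

Calling it with skip_keys=(".*heads_action.*",) would then leave the entire action head at its fresh initialization while still copying the rest of the pre-trained weights.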

Thanks, Karl! On a slightly different note, I had a question about learning resets. My training trajectories currently include a pick, a place, and then a retreat (which is slightly different each time since it comes from a VR controller). Is this a reasonable way to learn the end of an episode?

If you train with goal-image conditioning, no problem. If you train language-conditioned, it may be hard for the model to guess which retreat trajectory it should predict, but it should probably still work fine as long as the retreat doesn't need to be very precise, so I would just go ahead and try it with your current setup.

Excellent! I'm using goal image conditioning. Thanks!