google-research/slot-attention-video

Could you please provide the model config file for using segmentation as conditioning?

YRlin-12 opened this issue · 1 comment

What is the initializer config when using segmentation as conditioning?

tkipf commented

The following changes to the bbox-conditional configs (savi_conditional_small.py and savi_conditional_medium.py) should allow you to train with segmentation mask conditioning.

You need to update the conditioning key as follows:

config.conditioning_key = "segmentations"

You can replace the initializer config as follows:

# Initializer.
"initializer": ml_collections.ConfigDict({
    "module": "slot_attention_video.modules.SegmentationEncoderStateInit",
    "max_num_slots": 24,
    "zero_background": True,
    "reduction": "spatial_average",
    "backbone": ml_collections.ConfigDict({
        "module": "slot_attention_video.modules.CNN",
        "features": [32, 32, 32, 32],
        "kernel_size": [(5, 5), (5, 5), (5, 5), (5, 5)],
        "strides": [(2, 2), (2, 2), (2, 2), (1, 1)],
        "layer_transpose": [False, False, False, False]
    }),
    "pos_emb": ml_collections.ConfigDict({
        "module": "slot_attention_video.modules.PositionEmbedding",
        "embedding_type": "linear",
        "update_type": "project_add",
        "output_transform": ml_collections.ConfigDict({
            "module": "slot_attention_video.modules.MLP",
            "hidden_size": 64,
            "layernorm": "pre"
        }),
    }),
    "output_transform": ml_collections.ConfigDict({
        "module": "slot_attention_video.modules.MLP",
        "hidden_size": 256,
        "layernorm": "pre",
        "output_size": 128,
    }),
}),
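
As a rough sanity check on the backbone above, the sketch below estimates the spatial grid that the four strided convolutions would produce before the "spatial_average" reduction is applied. It assumes 'SAME'-style padding, which the config does not state, and the 64x64 mask resolution is only an illustrative value; `output_resolution` is a hypothetical helper, not part of the repository.

```python
import math

# Strides copied from the "backbone" entry of the initializer config above.
STRIDES = [(2, 2), (2, 2), (2, 2), (1, 1)]


def output_resolution(input_hw, strides):
    """Estimate the spatial size after a stack of strided convolutions.

    ASSUMPTION: each layer uses 'SAME'-style padding, so every layer maps a
    side of length n to ceil(n / stride). The config does not specify this.
    """
    height, width = input_hw
    for stride_h, stride_w in strides:
        height = math.ceil(height / stride_h)
        width = math.ceil(width / stride_w)
    return height, width


if __name__ == "__main__":
    # Hypothetical 64x64 segmentation masks, for illustration only.
    print(output_resolution((64, 64), STRIDES))
```

Under these assumptions the three stride-2 layers reduce each side by a factor of 8 and the final stride-1 layer leaves it unchanged, so the averaged feature vector per slot would summarise a grid one eighth of the mask resolution on each side.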