- Original Paper
- Original GitHub repository
The code consists of four parts:
- model: the main files of the DUET network.
- data: the data splits for the different datasets.
- cache: some cache files.
- script: the training scripts for DUET.
DUET
├── AwA2
├── cache
│   ├── AWA2
│   │   ├── attributeindex2prompt.json
│   │   └── id2imagepixel.pkl
│   ├── CUB
│   │   ├── attributeindex2prompt.json
│   │   ├── id2imagepixel.pkl
│   │   └── mapping.json
│   └── SUN
│       ├── attributeindex2prompt.json
│       ├── id2imagepixel.pkl
│       └── mapping.json
├── CUB_200_2011
├── data
│   ├── AWA2
│   │   ├── APN.mat
│   │   ├── TransE_65000.mat
│   │   ├── att_splits.mat
│   │   ├── attri_groups_9.json
│   │   ├── kge_CH_AH_CA_60000.mat
│   │   └── res101.mat
│   ├── CUB
│   │   ├── APN.mat
│   │   ├── att_splits.mat
│   │   ├── attri_groups_8.json
│   │   ├── attri_groups_8_layer.json
│   │   └── res101.mat
│   └── SUN
│       ├── APN.mat
│       ├── att_splits.mat
│       ├── attri_groups_4.json
│       └── res101.mat
├── log
│   ├── AWA2
│   ├── CUB
│   └── SUN
├── model
│   ├── log.py
│   ├── main.py
│   ├── main_utils.py
│   ├── model_proto.py
│   ├── modeling_lxmert.py
│   ├── opt.py
│   ├── swin_modeling_bert.py
│   ├── util.py
│   └── visual_utils.py
├── PLMs
├── out
│   ├── AWA2
│   ├── CUB
│   └── SUN
├── SUN
└── script
    ├── AWA2
    │   └── AWA2_GZSL.sh
    ├── CUB
    │   └── CUB_GZSL.sh
    └── SUN
        └── SUN_GZSL.sh
Python 3
PyTorch >= 1.8.0
Transformers >= 4.11.3
NumPy
- All experiments are performed with one RTX 3090Ti GPU.
- Run the following command to create the conda virtual environment: `conda env create -f cm.yaml`
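
After the environment is created, a quick sanity check can confirm it satisfies the version requirements above. This is a minimal, repo-agnostic snippet, not part of the DUET codebase:

```python
# Sanity-check the created environment against the stated requirements.
import numpy as np
import torch
import transformers

print("PyTorch:", torch.__version__)              # expected >= 1.8.0
print("Transformers:", transformers.__version__)  # expected >= 4.11.3
print("NumPy:", np.__version__)
print("CUDA available:", torch.cuda.is_available())
```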
- Dataset:
  - Download the original datasets (CUB, AWA2, SUN) and change `opt.image_root` to the dataset root path on your machine.
  - For the other required feature files: `APN.mat` in `data/` (from https://github.com/wenjiaXu/APN-ZSL) and `id2imagepixel.pkl` in `cache/` (Baidu cloud: https://pan.baidu.com/s/13oyLDNm6uoYpVgcMitrY-A, 19.89 GB, code: s07d); a spot-check sketch follows this list.
- Data split:
  - Please download the data folder and place it in `./data/`. `attributeindex2prompt.json` should be generated and placed in the corresponding `./cache/{dataset}/` folder (e.g., `./cache/AWA2/`); a format sketch follows this list.
- Pretrained Vision Transformer as the vision encoder.
- Pretrained Language Model as the language encoder (a download sketch follows this list).
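
For reference, the downloaded `id2imagepixel.pkl` can be spot-checked as below. This is a minimal sketch that assumes the file is a standard pickle mapping image ids to preprocessed pixel arrays/tensors; adjust it if the actual structure differs:

```python
# Spot-check the downloaded cache file.
# ASSUMPTION: id2imagepixel.pkl is a plain pickle whose keys are image ids
# and whose values are preprocessed pixel arrays/tensors.
import pickle

with open("cache/AWA2/id2imagepixel.pkl", "rb") as f:
    id2imagepixel = pickle.load(f)

first_id = next(iter(id2imagepixel))
value = id2imagepixel[first_id]
print(f"{len(id2imagepixel)} entries; example id: {first_id}, "
      f"value: {getattr(value, 'shape', type(value))}")
```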
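
Similarly, `attributeindex2prompt.json` can be generated in the following shape. This is a hypothetical sketch: the index-to-prompt layout is inferred from the file name, and the two example prompts are placeholders, not real AWA2 prompts:

```python
# Generate a minimal attributeindex2prompt.json for one dataset.
# ASSUMPTION: the file maps attribute indices to natural-language prompts;
# the entries below are illustrative placeholders only.
import json

attributeindex2prompt = {
    "0": "the photo contains an animal with black fur",
    "1": "the photo contains an animal with a long tail",
}
with open("cache/AWA2/attributeindex2prompt.json", "w") as f:
    json.dump(attributeindex2prompt, f, ensure_ascii=False, indent=2)
```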
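
For the pretrained encoders, one way to fetch checkpoints into `PLMs/` for the `--langM_path` / `--langMtokenizer_path` options is sketched below. The checkpoint id is an assumption (a common public Hugging Face model), not necessarily the one used in the paper:

```python
# Download a pretrained language model into PLMs/ so it can be passed
# to --langM_path / --langMtokenizer_path.
# ASSUMPTION: bert-base-uncased is a stand-in; substitute the checkpoint
# you actually intend to use.
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model.save_pretrained("PLMs/bert-base-uncased")
tokenizer.save_pretrained("PLMs/bert-base-uncased")
```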
The training script for AWA2_GZSL:
bash script/AWA2/AWA2_GZSL.sh
[--dataset {AWA2, SUN, CUB}] [--calibrated_stacking CALIBRATED_STACKING] [--nepoch NEPOCH] [--batch_size BATCH_SIZE] [--manualSeed MANUAL_SEED]
[--classifier_lr LEARNING_RATE] [--xe XE] [--attri ATTRI] [--gzsl] [--patient PATIENT] [--model_name MODEL_NAME] [--mask_pro MASK_PRO]
[--mask_loss_xishu MASK_LOSS_XISHU] [--xlayer_num XLAYER_NUM] [--construct_loss_weight CONSTRUCT_LOSS_WEIGHT] [--sc_loss SC_LOSS] [--mask_way MASK_WAY]
[--attribute_miss ATTRIBUTE_MISS] [--prefer_high] [--langM_path LANGM_PATH] [--langMtokenizer_path LANGMTOKENIZER_PATH] [--visualM_deit_path VISUALM_DEIT_PATH] [--visualM_swin_path VISUALM_SWIN_PATH]
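
As a rough illustration of how a subset of these options could be declared with `argparse`: the real definitions live in `model/opt.py`, and every default below is an assumption for illustration, not the repository's actual value:

```python
# Illustrative sketch of a few of the options above (see model/opt.py
# for the real definitions; defaults here are assumptions).
import argparse

parser = argparse.ArgumentParser(description="DUET training options (sketch)")
parser.add_argument("--dataset", choices=["AWA2", "SUN", "CUB"], default="AWA2")
parser.add_argument("--calibrated_stacking", type=float, default=0.0)
parser.add_argument("--nepoch", type=int, default=30)
parser.add_argument("--batch_size", type=int, default=64)
parser.add_argument("--manualSeed", type=int, default=None)
parser.add_argument("--classifier_lr", type=float, default=1e-4)
parser.add_argument("--gzsl", action="store_true")          # evaluate in the GZSL setting
parser.add_argument("--mask_pro", type=float, default=0.0)  # masking probability
parser.add_argument("--langM_path", type=str, default="PLMs/")

args = parser.parse_args()
print(args)
```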