This repository contains the code of the AAF framework proposed in [this paper](https://arxiv.org/abs/2210.13923). The main idea behind this work is to propose a flexible framework for implementing various attention mechanisms for Few-Shot Object Detection. The framework is composed of three modules, Spatial Alignment, Global Attention and Fusion Layer, which are applied successively to combine features from query and support images.
The inputs of the framework are:
- `query_features` `List[Tensor(B, C, H, W)]`: query features at different levels. For each level, the features are of shape Batch x Channels x Height x Width.
- `support_features` `List[Tensor(N, C, H', W')]`: support features at different levels. The first dimension corresponds to the number of support images, regrouped by class: `N = N_WAY * K_SHOT`.
- `support_targets` `List[BoxList]`: bounding boxes for the objects in each support image.
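
As a rough illustration of the `N = N_WAY * K_SHOT` layout, the sketch below maps each support index to a class and a shot. It assumes that the shots of a class are stored contiguously, which is one reading of "regrouped by class"; the helper name `support_layout` is hypothetical and not part of the repository.

```python
from typing import List, Tuple


def support_layout(n_way: int, k_shot: int) -> List[Tuple[int, int]]:
    """Return (class_index, shot_index) for each of the N support images.

    Assumption: support images of the same class are contiguous along the
    first dimension, so N = n_way * k_shot.
    """
    return [(i // k_shot, i % k_shot) for i in range(n_way * k_shot)]


if __name__ == "__main__":
    # 3 classes with 2 shots each give N = 6 support images.
    for index, (cls, shot) in enumerate(support_layout(n_way=3, k_shot=2)):
        print(f"support[{index}] -> class {cls}, shot {shot}")
```
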
The framework can be configured using a separate config file. Examples of such files are available under `/config_files/aaf_framework/`. The structure of these files is simple:

    ALIGN_FIRST: # True/False, run alignment before attention when True
    OUT_CH: # Number of features output by the fusion layer
    ALIGNMENT:
        MODE: # Name of the alignment module selected
    ATTENTION:
        MODE: # Name of the attention module selected
    FUSION:
        MODE: # Name of the fusion module selected

File name | Method | Alignment | Attention | Fusion |
---|---|---|---|---|
`identity.yaml` | Identity | IDENTITY | IDENTITY | IDENTITY |
`feature_reweighting.yaml` | FSOD via feature reweighting | IDENTITY | REWEIGHTING_BATCH | IDENTITY |
`meta_faster_rcnn.yaml` | Meta Faster-RCNN | SIMILARITY_ALIGN | META_FASTER | META_FASTER |
`self_adapt.yaml` | Self-adaptive attention for FSOD | IDENTITY_NO_REPEAT | GRU | IDENTITY |
`dynamic.yaml` | Dynamic relevance learning | IDENTITY | INTERPOLATE | DYNAMIC_R |
`dana.yaml` | Dual Awareness Attention for FSOD | CISA | BGA | HADAMARD |
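
To make the link between the table and the config structure concrete, here is a small sketch that renders the body of an AAF config file from one row of the table. The module names come from the table above; the `ALIGN_FIRST` and `OUT_CH` values passed in the example are placeholders, not the values shipped with the repository, and `PRESETS` and `render_aaf_config` are hypothetical names.

```python
# (alignment, attention, fusion) modes copied from the table above.
PRESETS = {
    "identity.yaml": ("IDENTITY", "IDENTITY", "IDENTITY"),
    "feature_reweighting.yaml": ("IDENTITY", "REWEIGHTING_BATCH", "IDENTITY"),
    "meta_faster_rcnn.yaml": ("SIMILARITY_ALIGN", "META_FASTER", "META_FASTER"),
    "self_adapt.yaml": ("IDENTITY_NO_REPEAT", "GRU", "IDENTITY"),
    "dynamic.yaml": ("IDENTITY", "INTERPOLATE", "DYNAMIC_R"),
    "dana.yaml": ("CISA", "BGA", "HADAMARD"),
}


def render_aaf_config(file_name: str, align_first: bool, out_ch: int) -> str:
    """Render the text of an AAF config file for one preset of the table."""
    alignment, attention, fusion = PRESETS[file_name]
    lines = [
        f"ALIGN_FIRST: {align_first}",
        f"OUT_CH: {out_ch}",
        "ALIGNMENT:",
        f"    MODE: {alignment}",
        "ATTENTION:",
        f"    MODE: {attention}",
        "FUSION:",
        f"    MODE: {fusion}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # align_first and out_ch are placeholder values chosen for illustration.
    print(render_aaf_config("dana.yaml", align_first=True, out_ch=256))
```
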
The path to the AAF config file should be specified inside the master config file (i.e. the one for the whole network) under `FEWSHOT.AAF.CFG`.
For each module, the classes implementing the available choices are grouped in a single file: `/modelling/aaf/alignment.py`, `/modelling/aaf/attention.py` and `/modelling/aaf/fusion.py`.
### Spatial Alignment
Spatial Alignment spatially reorganizes the features of one feature map to match another one. The idea is to align similar features in both maps so that comparison is easier.
Name | Description |
---|---|
IDENTITY | Repeats the feature to match BNCHW and NBCHW dimensions |
IDENTITY_NO_REPEAT | Identity without repetition |
SIMILARITY_ALIGN | Compute similarity matrix between support and query and align support to query accordingly. |
CISA | CISA block from the Dual Awareness Attention method (see `dana.yaml`) |
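
The idea behind `SIMILARITY_ALIGN` can be sketched for a single query/support pair as follows. This is only an illustration under stated assumptions: it uses NumPy instead of PyTorch to stay lightweight, a dot-product similarity, and a softmax normalization over support locations; the actual module in `/modelling/aaf/alignment.py` may differ on all three points. The function name is hypothetical.

```python
import numpy as np


def similarity_align(query: np.ndarray, support: np.ndarray) -> np.ndarray:
    """Align a support map (C, H', W') onto a query map (C, H, W).

    Each query location receives a weighted average of the support features,
    with weights given by a softmax over dot-product similarities.
    """
    c, h, w = query.shape
    q = query.reshape(c, h * w)                  # (C, HW)
    s = support.reshape(c, -1)                   # (C, H'W')
    sim = q.T @ s                                # (HW, H'W') similarity matrix
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)
    aligned = s @ weights.T                      # (C, HW)
    return aligned.reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(8, 5, 6))
    support = rng.normal(size=(8, 3, 3))
    print(similarity_align(query, support).shape)
```

After alignment the support map has the spatial size of the query map, which is what makes the point-wise fusion operations applicable.
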
### Global Attention
Global Attention highlights some features of a map according to an attention vector computed globally on another one. The idea is to leverage global and hopefully semantic information.
Name | Description |
---|---|
IDENTITY | Simply pass features to next modules. |
REWEIGHTING | Reweights query features using globally pooled vectors from support. |
REWEIGHTING_BATCH | Same as above but support examples are the same for the whole batch. |
SELF_ATTENTION | Same as above but attention vectors are computed from the alignment matrix between query and support. |
BGA | BGA blocks from the Dual Awareness Attention method (see `dana.yaml`) |
META_FASTER | Attention block from Meta Faster-RCNN (see `meta_faster_rcnn.yaml`) |
POOLING | Pools query and support features to the same size. |
INTERPOLATE | Upsamples support features to match query size. |
GRU | Computes attention vectors through a graph representation using a GRU. |
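
A minimal sketch of the reweighting idea is given below, assuming global average pooling of the support maps and channel-wise multiplication of the query features. It is written with NumPy for brevity and is not the repository's implementation; the function name and the `(B, N, C, H, W)` output layout are assumptions made for this example.

```python
import numpy as np


def reweight(query: np.ndarray, support: np.ndarray) -> np.ndarray:
    """Reweight query features with globally pooled support vectors.

    query:   (B, C, H, W) query features for one level.
    support: (N, C, H', W') support features for the same level.
    Returns a (B, N, C, H, W) array: one reweighted query map per support image.
    """
    vectors = support.mean(axis=(2, 3))          # (N, C) global average pooling
    return query[:, None] * vectors[None, :, :, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(2, 4, 5, 5))
    support = rng.normal(size=(3, 4, 7, 7))
    print(reweight(query, support).shape)
```
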
### Fusion Layer
The Fusion Layer directly combines the features from support and query. These maps must have the same dimensions for point-wise operations, hence fusion is often employed along with alignment.
Name | Description |
---|---|
IDENTITY | Returns only adapted query features. |
ADD | Point-wise sum between query and support features. |
HADAMARD | Point-wise multiplication between query and support features. |
SUBSTRACT | Point-wise subtraction between query and support features. |
CONCAT | Channel concatenation of query and support features. |
META_FASTER | Fusion layer from Meta Faster-RCNN (see `meta_faster_rcnn.yaml`) |
DYNAMIC_R | Fusion layer from Dynamic relevance learning (see `dynamic.yaml`) |
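
The simple point-wise fusion modes can be sketched as below. The snippet assumes that query and support features have already been brought to the same `(B, N, C, H, W)` shape, with channels on axis 2, and uses NumPy rather than PyTorch; the `FUSIONS` dictionary and `fuse` helper are hypothetical names for illustration only.

```python
import numpy as np

# Keys mirror the mode names of the table above; the bodies are illustrative.
FUSIONS = {
    "IDENTITY": lambda q, s: q,
    "ADD": lambda q, s: q + s,
    "HADAMARD": lambda q, s: q * s,
    "SUBSTRACT": lambda q, s: q - s,
    "CONCAT": lambda q, s: np.concatenate([q, s], axis=2),  # channel axis
}


def fuse(mode: str, query: np.ndarray, support: np.ndarray) -> np.ndarray:
    """Apply one fusion mode to query and support features of equal shape."""
    if query.shape != support.shape:
        raise ValueError("query and support must have the same shape")
    return FUSIONS[mode](query, support)


if __name__ == "__main__":
    q = np.ones((2, 3, 4, 5, 5))
    s = np.full((2, 3, 4, 5, 5), 2.0)
    for mode in FUSIONS:
        print(mode, fuse(mode, q, s).shape)
```

Only `CONCAT` changes the number of channels, which is one reason the config exposes `OUT_CH` separately.
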
Training and evaluation scripts are available.
TODO: give a code snippet to run training with a specified config file (modify main). Basically, create two scripts, `train.py` and `eval.py`, that take the config file as an argument. Explain the `DataHandler` class a bit.
Dependencies used for this project can be installed through `conda create --name <env> --file requirements.txt`.
Please note that not all of these requirements are necessary; the list will be updated soon.
FCOS must be installed from source. Some issues may appear after installation, depending on the versions of the Python packages you use:
- `cpu/vision.h` file not found: replace all occurrences in files under `fcos_core/csrc/cpu/` by `vision.h` (see this issue).
- Error related to `AT_CHECK` with PyTorch > 1.5: replace all occurrences by `TORCH_CHECK` (see this issue).
- Error related to `torch._six.PY36`: replace all occurrences of `PY36` by `PY37` or check this PR.
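
The replacements listed above can be done by hand or with a small script such as the sketch below. The helper `replace_in_tree` is hypothetical, the list of file suffixes is a guess at which source files are concerned, and the directory in the commented usage line should be adapted to where the FCOS sources live on your machine.

```python
from pathlib import Path
from typing import Iterable, List


def replace_in_tree(
    root: str,
    old: str,
    new: str,
    suffixes: Iterable[str] = (".h", ".cpp", ".cu", ".cuh"),
) -> List[Path]:
    """Replace `old` by `new` in every matching source file under `root`.

    Returns the list of files that were modified.
    """
    allowed = set(suffixes)
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in allowed:
            continue
        text = path.read_text(encoding="utf-8")
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed


# Example usage (path is an assumption, run from the FCOS source directory):
# replace_in_tree("fcos_core/csrc", "AT_CHECK", "TORCH_CHECK")
```

Since the script rewrites files in place, it is safer to run it on a clean checkout so the changes can be reviewed with `git diff`.
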
Results on Pascal VOC, COCO and DOTA.