This is the DINO v1 repository, modified for use with crops from MeerKAT continuum images from the MGCLS survey.
It accompanies the paper Self-Supervised Learning on MeerKAT Wide-Field Continuum Images.
Public data can be downloaded from the MGCLS data release.
- main_dino.py -> main_meerkat.py
Minor edits in utils.py to accept single-channel images as input
Data preparation
- mgcls_data_prep.py
Training with custom dataset
- MeerKATDataset.py
- transforms.py
Feature extraction
- eval_knn_train.py
Evaluation based on extracted features
- evaluator.py
- eval.yaml
- evaluation_models.py
- EvaluationDatasets.py
To perform the evaluation as in the paper, it is required to install the following repositories:
To generate COCO-format labels for compact sources, the following repository is required:
Clone or fork the repository.
Data preparation
python mgcls_data_prep.py --data_path /path/to/FITS/files
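As a rough illustration of what cutting a wide-field image into crops involves, the sketch below computes the top-left corners of square crops that fit fully inside an image. It is not taken from mgcls_data_prep.py: the function name `crop_origins`, the default crop size of 256 pixels, and the stride handling are all assumptions made for this example.

```python
def crop_origins(height, width, size=256, stride=None):
    """Return (row, col) top-left corners of size x size crops inside the image.

    Hypothetical helper for illustration only; the real script may tile,
    pad, or filter crops differently.
    """
    if stride is None:
        stride = size  # non-overlapping crops by default
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    # Only keep windows that lie entirely inside the image.
    rows = range(0, height - size + 1, stride)
    cols = range(0, width - size + 1, stride)
    return [(r, c) for r in rows for c in cols]


if __name__ == "__main__":
    for row, col in crop_origins(512, 768):
        print(f"crop rows {row}:{row + 256}, cols {col}:{col + 256}")
```

A stride smaller than the crop size yields overlapping crops; images smaller than the crop size yield no crops at all.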
Training
python main_meerkat.py --data_path $train --output_dir $output_dir --arch $arch --patch_size $patch_size --epochs $epochs --saveckp_freq $savefrq --num_workers 0 --batch_size_per_gpu $batch_size_per_gpu --use_fp16 false --momentum_teacher 0.996 --augmentations rotate powerlaw --weight_decay $weight_decay --weight_decay_end $weight_decay_end --lr $lr --in_chans 1 --project $project --checkpoint_name $checkpoint_name
If using more than one GPU with SLURM:
srun python -m torch.distributed.launch --nproc_per_node=4 main_meerkat.py --data_path $train --output_dir $output_dir --arch $arch --patch_size $patch_size --epochs $epochs --saveckp_freq $savefrq --num_workers 0 --batch_size_per_gpu $batch_size_per_gpu --use_fp16 false --momentum_teacher 0.996 --augmentations rotate powerlaw --weight_decay $weight_decay --weight_decay_end $weight_decay_end --lr $lr --in_chans 1 --project $project --checkpoint_name $checkpoint_name
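The `--momentum_teacher 0.996` flag sets the starting momentum of the teacher update. In upstream DINO v1 the teacher is an exponential moving average of the student, and the momentum follows a cosine schedule from the base value up to 1 over training. The sketch below shows that schedule in plain Python, assuming this fork keeps the upstream behaviour; the function names are chosen for this example, and the parameters are plain lists of floats rather than tensors.

```python
import math


def teacher_momentum(step, total_steps, base=0.996, final=1.0):
    """Cosine schedule from `base` (step 0) towards `final` (last step)."""
    return final + 0.5 * (base - final) * (1.0 + math.cos(math.pi * step / total_steps))


def ema_update(teacher, student, momentum):
    """Return new teacher parameters: m * teacher + (1 - m) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


if __name__ == "__main__":
    total = 1000  # illustrative number of training iterations
    for step in (0, 500, 999):
        print(step, round(teacher_momentum(step, total), 6))
```

Because the momentum approaches 1 late in training, the teacher changes more and more slowly as the run progresses.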
Evaluation
Generate labels via pyBDSF_to_COCO and crop_catalog_aggs(). Otherwise, modify evaluator.py to accept the names of your custom labels.
Extract features:
python eval_knn_train.py --data_path $data_path --dump_features $output_dir --arch $arch --patch_size $patch_size --num_workers 0 --in_chans $in_chans --pretrained_weights $output_dir/$checkpoint_name
Modify config.yaml to point to the desired inputs, labels, and output destination.
python evaluator.py --config config.yaml
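To give a feel for what can be done with the dumped features, the sketch below classifies one feature vector by a majority vote over its k nearest neighbours under cosine similarity. This is a simplified stand-in, not the logic of evaluator.py, and upstream DINO uses a weighted k-NN rather than a plain vote. The function names, the toy feature vectors, and the label strings are assumptions made for the example.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def knn_predict(query, train_feats, train_labels, k=3):
    """Majority vote over the k training features most similar to `query`."""
    q = l2_normalize(query)
    sims = [sum(a * b for a, b in zip(q, l2_normalize(f))) for f in train_feats]
    # Indices of the k highest cosine similarities.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    return Counter(train_labels[i] for i in top).most_common(1)[0][0]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = ["compact", "compact", "other", "other"]  # hypothetical labels
    print(knn_predict([1.0, 0.05], feats, labels, k=3))
```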
Attention maps
python visualize_attention.py --image_path prepared_image.npy --image_size 256 256 --arch $arch --output_dir $output_dir --patch_size $patch_size --pretrained_weights $output_dir/$checkpoint_name --in_chans 1
Checkpoints are available for download here.
Architecture | Weight initialization | Pre-training Epochs | Full Checkpoint | Teacher Checkpoint |
---|---|---|---|---|
ViT-S8 | Random | 325 | mgcls_vits8_pretrain_325.pth | mgcls_vits8_pretrain_325_teacher.pth |
ViT-B16 | DINOv1 | 25 | mgcls_vitb16_pretrain_025.pth | mgcls_vitb16_pretrain_025_teacher.pth |
ResNet50 | Random | 425 | mgcls_resnet50_pretrain_425.pth | mgcls_resnet50_pretrain_425_teacher.pth |
Use the teacher network weights for feature extraction or fine-tuning.
See the docstrings for details.
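When loading teacher weights by hand, the state-dict keys may need cleaning before they match a bare backbone. The sketch below assumes the upstream DINO v1 checkpoint layout, in which a full checkpoint holds `student` and `teacher` entries and parameter names can carry `module.` and `backbone.` prefixes. The function name is hypothetical, and the layout of the checkpoints listed above has not been verified here.

```python
def extract_teacher_state(checkpoint, key="teacher"):
    """Return a teacher state dict with wrapper prefixes removed from the keys.

    Accepts either a full checkpoint (a dict containing `key`) or a dict
    that already is the teacher state. Assumes upstream DINO v1 naming.
    """
    state = checkpoint[key] if key in checkpoint else checkpoint
    cleaned = {}
    for name, value in state.items():
        # Drop the DistributedDataParallel and multi-crop wrapper prefixes.
        name = name.replace("module.", "").replace("backbone.", "")
        cleaned[name] = value
    return cleaned


if __name__ == "__main__":
    # Toy checkpoint with placeholder values instead of tensors.
    toy = {
        "student": {"module.backbone.cls_token": 0},
        "teacher": {"backbone.cls_token": 1, "head.last_layer.weight": 2},
    }
    print(extract_teacher_state(toy))
```

With PyTorch, the input dict would typically come from `torch.load(path, map_location="cpu")`, and the cleaned dict can be passed to `model.load_state_dict(..., strict=False)` so that projection-head entries not present in the backbone are ignored.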