Our paper POAR: Towards Open Vocabulary Pedestrian Attribute Recognition
A PyTorch Lightning solution to train pedestrian open attribute recognition models
pip install 'git+https://github.com/katsura-jp/pytorch-cosine-annealing-with-warmup'
pip install pytorch-lightning
python train_finetune.py --folder data_dir --batch_size 512
CUDA_VISIBLE_DEVICES=0 python test_UDA_T2I.py --testset PA100K --trainset PETA --folder ./test/dataset/PETA/
As long as each image-caption pair shares the same stem name (e.g. img1.png and img1.txt), all you need to do is specify the folder at runtime. Any subfolder structure will be ignored, meaning foo/bar/image1.jpg will always find its myster/folder/image1.txt so long as they share a common parent folder. All image suffixes will work; the only expectation is that captions are separated by \n.
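The stem-matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the repository's actual data loader: the function names `pair_images_with_captions` and `read_captions` are hypothetical, and the `IMAGE_SUFFIXES` set is an assumed subset chosen for the example.

```python
from pathlib import Path

# Assumed set of image suffixes, for illustration only.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def pair_images_with_captions(folder):
    """Map each image under `folder` to the caption file sharing its stem.

    Hypothetical helper: subfolder structure is ignored, only stems are compared.
    """
    root = Path(folder)
    # Index every caption file by stem, searching all subfolders.
    # If two caption files share a stem, the last one found wins.
    captions = {p.stem: p for p in root.rglob("*.txt")}
    pairs = {}
    for image in root.rglob("*"):
        if not image.is_file() or image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if image.stem in captions:
            pairs[image] = captions[image.stem]
    return pairs


def read_captions(caption_file):
    """Return the non-empty captions of a file, one caption per line."""
    text = Path(caption_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.split("\n") if line.strip()]
```

Calling `pair_images_with_captions("data_dir")` would return a dictionary from image paths to caption paths; images without a matching caption file are skipped in this sketch.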
Our aim is to create an easy-to-use Lightning implementation of pedestrian open attribute recognition.
@inproceedings{Yue2023poar,
title={POAR: Towards Open Vocabulary Pedestrian Attribute Recognition},
author={Yue Zhang and Suchen Wang and Shichao Kan and Zhenyu Weng and Yigang Cen and Yap-Peng Tan},
booktitle={In Proceedings of the 31st ACM International Conference on Multimedia (MM '23), October 29–November 3, Ottawa, ON, Canada.},
year={2023},
doi={10.1145/3581783.3611719}
}
- Get OpenAI's model creation script
- Create model inits
  - ResNet50
  - ResNet50x4
  - ResNet101
  - ViT-B/32
  - all models
- Create model wrapper
- Create lightning trainer
- Create dataset files
- Performance boosts
  - Mixed-precision
  - Self-distillation
  - Gradient checkpointing
  - Half-precision Adam statistics
  - Half-precision stochastically rounded text encoder weights