
[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning


UniAP

UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning
Meiqi Sun*, Zhonghan Zhao*, Wenhao Chai*, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang
AAAI 2024

We introduce UniAP, a novel Universal Animal Perception model that leverages few-shot learning to enable cross-species perception among various visual tasks.

🔥 News

  • [2023.12.10] 🎉 Our paper was accepted by AAAI 2024.
  • [2023.08.20] We released our code.
  • [2023.08.19] 📃 We released the paper.

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

Setup

  1. Download the datasets and arrange them in the following structure:
<Root>
|--<AnimalKingdom>
|   |--<animal1>_<rgb>
|   | ...
|   |--<animal2>_<label>
|   |...
|
|--<APT-36K>
|   |--<animal1>_<rgb>
|   | ...
|   |--<animal2>_<label>
|   |...
|
|--<AnimalPose>
|   |--<animal1>_<rgb>
|   | ...
|   |--<animal2>_<label>
|   |...
|
|--<Oxford-IIITPet>
|   |--<animal1>_<rgb>
|   | ...
|   |--<animal2>_<label>
|   |...
|
|...
  2. Create a data_paths.yaml file and set the root directory path (<Root> in the structure above) with the entry UniASET: PATH_TO_YOUR_UniASET.

  3. Install the prerequisites with pip install -r requirements.txt.

  4. Create the model/pretrained_checkpoints directory and download the BEiT pre-trained checkpoints into it.
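The file and directory steps above can be scripted. The following is a minimal sketch, not part of the repository: the helper name `prepare_workspace` and its arguments are hypothetical, and it assumes data_paths.yaml lives in the repository root and contains only the `UniASET` entry. It does not download datasets or BEiT checkpoints; it only reports which dataset folders from the tree above are missing.

```python
from pathlib import Path

# Dataset folder names taken from the directory tree in the Setup section.
DATASETS = ["AnimalKingdom", "APT-36K", "AnimalPose", "Oxford-IIITPet"]


def prepare_workspace(repo_dir, data_root):
    """Hypothetical helper: write data_paths.yaml, create the checkpoint
    directory, and return the dataset folders missing under data_root."""
    repo_dir = Path(repo_dir)
    data_root = Path(data_root)

    # Single-key YAML file in the `UniASET: <path>` form described above.
    # Assumption: the file is expected in the repository root.
    (repo_dir / "data_paths.yaml").write_text(f"UniASET: {data_root}\n")

    # Directory where the BEiT checkpoints are expected to be placed.
    (repo_dir / "model" / "pretrained_checkpoints").mkdir(parents=True, exist_ok=True)

    # Report dataset folders that have not been downloaded yet.
    return [name for name in DATASETS if not (data_root / name).is_dir()]
```

Calling `prepare_workspace(".", "/path/to/Root")` from the repository root returns a list of missing dataset folders, which is empty once every dataset is in place.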

Usage

Training

python main.py --stage 0 --task_id [0/1/2/3]
  • To train universally on all tasks, set task_id=3.
  • To train on a specific task, set task_id=0 for pose estimation, task_id=1 for semantic segmentation, or task_id=2 for classification.
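When training runs are launched from a script, it can help to validate the task ID before starting. The sketch below only assembles the training command shown above; the function name `build_train_command` is hypothetical, and the mapping is copied from the list above rather than read from main.py.

```python
import sys

# Mapping stated in this README: 0-2 are single tasks, 3 trains on all tasks.
TRAIN_TASK_IDS = {
    0: "pose estimation",
    1: "semantic segmentation",
    2: "classification",
    3: "all tasks",
}


def build_train_command(task_id):
    """Hypothetical helper: return the argv list for a stage-0 (training) run."""
    if task_id not in TRAIN_TASK_IDS:
        raise ValueError(
            f"task_id must be one of {sorted(TRAIN_TASK_IDS)}, got {task_id!r}"
        )
    # Use the current interpreter so the run stays in the active environment.
    return [sys.executable, "main.py", "--stage", "0", "--task_id", str(task_id)]
```

From the repository root, `subprocess.run(build_train_command(3), check=True)` would then start universal training on all tasks.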

Fine-tuning

python main.py --stage 1 --task [kp/mask/cls]
  • To fine-tune on a specific task, set task=kp for pose estimation, task=mask for semantic segmentation, or task=cls for classification.

Evaluation

python main.py --stage 2 --task [kp/mask/cls]
  • To evaluate on a specific task, set task=kp for pose estimation, task=mask for semantic segmentation, or task=cls for classification.
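Fine-tuning and evaluation take the same task keys, so the two stages can be chained for one task. This is a minimal sketch under stated assumptions: the names `build_command`, `finetune_then_evaluate`, and the stage labels "finetune" and "eval" are hypothetical, and the code only wraps the two commands shown above. It assumes it is run from the repository root.

```python
import subprocess
import sys

# Task keys and stage numbers as listed in this README.
TASKS = {"kp": "pose estimation", "mask": "semantic segmentation", "cls": "classification"}
# The labels "finetune" and "eval" are illustrative names for stages 1 and 2.
STAGES = {"finetune": 1, "eval": 2}


def build_command(stage, task):
    """Hypothetical helper: return the argv list for a fine-tuning or evaluation run."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {sorted(STAGES)}, got {stage!r}")
    if task not in TASKS:
        raise ValueError(f"task must be one of {sorted(TASKS)}, got {task!r}")
    return [sys.executable, "main.py", "--stage", str(STAGES[stage]), "--task", task]


def finetune_then_evaluate(task, run=subprocess.run):
    """Fine-tune on `task`, then evaluate it; stop if either command fails."""
    for stage in ("finetune", "eval"):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(build_command(stage, task), check=True)
```

For example, `finetune_then_evaluate("kp")` runs stage 1 and then stage 2 for pose estimation. The `run` parameter exists only so the command sequence can be inspected without launching a process.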

Acknowledgements

Our code refers to the following repositories:

Citation

If you find UniAP useful for your research and applications, please cite it using this BibTeX:

@article{sun2023uniap,
  title={UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning},
  author={Sun, Meiqi and Zhao, Zhonghan and Chai, Wenhao and Luo, Hanjun and Cao, Shidong and Zhang, Yanting and Hwang, Jenq-Neng and Wang, Gaoang},
  journal={arXiv preprint arXiv:2308.09953},
  year={2023}
}