CvT-ASSD: Convolutional vision-Transformer Based Attentive Single Shot MultiBox Detector (ICTAI 2021, a CCF-C conference: The 33rd IEEE International Conference on Tools with Artificial Intelligence)

Primary language: Python. License: Apache License 2.0 (Apache-2.0).


The repository also includes the extra CvT, CvT-SSD, and VGG-ASSD models.





In response to the call for open source, this project was officially open-sourced on 2021-07-12...

Project architecture:

(Figure: CvT-ASSD file example)


  1. You will probably need an Anaconda environment that contains all of the following packages:

    • pytorch 1.9.0 py3.7_cuda10.2_cudnn7_0 pytorch
    • cudatoolkit 10.2.89 h74a9793_1
    • opencv-python pypi_0 pypi
    • visdom pypi_0 pypi
    • yacs 0.1.8 pypi_0 pypi
    • jupyter 1.0.0 pypi_0 pypi
  2. For training, an NVIDIA GPU is strongly recommended for speed. We use two NVIDIA GTX 1080 Ti cards, but we recommend GPUs with more memory, such as the Tesla V100 or RTX 3090.

  3. Before you run the code, whether for study or to reproduce the results of the paper "CvT-ASSD", mark the CvT_SSD/model/ directory as a Sources Root, because much of the code references modules inside the model directory.

  4. Download the PyTorch parameter file (the one with the ".pth" extension) and move it into models/CvT/weights, as shown in 项目结构.PNG.

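  5. Object detection benchmarks (following the original SSD paper) generally use the VOC2007 test data as the model's test set. The training set can be one of the following combinations: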

      1. 07: VOC2007 trainval (training and validation sets)
      2. 07+12: VOC2007 trainval + VOC2012 trainval (training and validation sets)
      3. 07+12+COCO: pre-trained on COCO trainval35k, then fine-tuned on 07+12
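  6. The evaluation metric, mAP, uses the VOC07MApMetric provided by MXNet: recall is divided into 10 equal intervals (the 11 points 0, 0.1, ..., 1.0), the precision values at these points are averaged to give each class's AP, and the APs are then averaged over classes. For details, see https://blog.csdn.net/u014203453/article/details/77598997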
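
The 11-point averaging described in the last item can be sketched in plain Python. This is only an illustration of the VOC2007-style computation, not the code of MXNet's VOC07MApMetric; the function names `voc07_average_precision` and `mean_average_precision` are hypothetical, and the inputs are assumed to be the cumulative recall and precision of one class after sorting its detections by score.

```python
from typing import Sequence


def voc07_average_precision(recall: Sequence[float],
                            precision: Sequence[float]) -> float:
    """11-point interpolated AP for one class (VOC2007 style).

    recall[i] and precision[i] are assumed to be the cumulative recall and
    precision after the i-th highest-scoring detection of that class.
    """
    ap = 0.0
    for k in range(11):  # recall thresholds 0.0, 0.1, ..., 1.0
        threshold = k / 10
        # Interpolated precision: the best precision reached at any recall
        # at or above the threshold, or 0 if that recall is never reached.
        reachable = [p for r, p in zip(recall, precision) if r >= threshold]
        ap += max(reachable, default=0.0) / 11
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP: the plain average of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    # Toy class with 2 ground-truth boxes and 3 detections: TP, FP, TP.
    recall = [0.5, 0.5, 1.0]
    precision = [1.0, 0.5, 2 / 3]
    print(voc07_average_precision(recall, precision))
```

In the toy example, the six thresholds from 0.0 to 0.5 see an interpolated precision of 1.0 and the five thresholds from 0.6 to 1.0 see 2/3, so the AP is (6 + 10/3) / 11.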