Dataset_to_VOC_converter

Scripts to convert datasets (Caltech pedestrian, MS COCO, HDA) to PASCAL VOC format


These scripts convert datasets (MS COCO, Caltech pedestrian dataset) to PASCAL VOC format for later training.

Requirements

  • Python 2.7 (the scripts do not work correctly under Python 3)
  • MS COCO toolbox
  • cytoolz
  • lxml
  • scipy, numpy

Usage

COCO

anno_json_image_urls.py: extract image URLs (the COCO source, not Flickr) from an annotation json file. See anno_json_image_urls.sh
download_coco_images.py: download COCO image files from the given URLs (extracted from an instance/keypoint annotation json file). See download.sh
anno_coco2voc.py: convert coco annotation json file to VOC xml files. See anno_coco2voc.sh
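
As a rough illustration of the COCO steps above, here is a minimal sketch that does not reproduce the actual scripts. It assumes the standard COCO json layout (images with file_name and coco_url, annotations with a bbox of [x, y, width, height]) and uses the standard-library xml.etree.ElementTree instead of lxml so it runs without extra dependencies. The function names extract_image_urls, coco_bbox_to_voc and coco_to_voc_trees are made up for this example.

```python
import xml.etree.ElementTree as ET


def extract_image_urls(coco):
    """Map each image file name to its COCO-hosted URL (not the Flickr one)."""
    return {img["file_name"]: img["coco_url"] for img in coco["images"]}


def coco_bbox_to_voc(bbox):
    """Convert COCO [x, y, width, height] to VOC (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return int(round(x)), int(round(y)), int(round(x + w)), int(round(y + h))


def coco_to_voc_trees(coco):
    """Build one VOC-style <annotation> element per image that has boxes."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    images = {img["id"]: img for img in coco["images"]}
    trees = {}
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        root = trees.get(img["file_name"])
        if root is None:
            root = ET.Element("annotation")
            ET.SubElement(root, "filename").text = img["file_name"]
            size = ET.SubElement(root, "size")
            ET.SubElement(size, "width").text = str(img["width"])
            ET.SubElement(size, "height").text = str(img["height"])
            ET.SubElement(size, "depth").text = "3"  # assumption: RGB images
            trees[img["file_name"]] = root
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = names[ann["category_id"]]
        box = ET.SubElement(obj, "bndbox")
        corners = coco_bbox_to_voc(ann["bbox"])
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), corners):
            ET.SubElement(box, tag).text = str(value)
    return trees
```

Each returned element can be serialized with ET.tostring(root) and written to one xml file per image. Images without any annotation are skipped in this sketch.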

Caltech

vbb2voc.py: extract images that contain person bounding boxes from the seq files and convert the vbb annotation files to xml files.
PS: The Caltech pedestrian dataset has 4 kinds of person labels: person, person-fa, person?, people. In my case, I only need the person type. If you want to use other types, set person_types to the corresponding type list (like ['person', 'people']) in the parse_anno_file function.
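
The label filtering described above can be pictured with the small sketch below. It is only an illustration under assumptions: the annotation records are plain dicts with a "label" key, and the helper filter_person_boxes is hypothetical, not the code inside parse_anno_file.

```python
def filter_person_boxes(annotations, person_types=("person",)):
    """Keep only annotations whose label is one of the wanted person types."""
    allowed = set(person_types)
    return [ann for ann in annotations if ann["label"] in allowed]


# Hypothetical records for one frame; "bbox" values are made up.
frame = [
    {"label": "person", "bbox": [10, 20, 30, 60]},
    {"label": "people", "bbox": [100, 40, 80, 90]},
    {"label": "person?", "bbox": [5, 5, 10, 20]},
]

only_person = filter_person_boxes(frame)
person_and_people = filter_person_boxes(frame, ["person", "people"])
```

With the default, only the first record is kept; passing ['person', 'people'] keeps the first two.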

HDA

anno_had2voc: convert HDA annotation info to VOC format.