caltech-pedestrian-dataset-to-yolo-format-converter

Converts the format of the Caltech Pedestrian Dataset to the format that YOLO uses.

This repo is adapted from

dependencies

  • opencv
  • numpy
  • scipy
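
These can typically be installed with pip; `opencv-python` is the assumed pip package name for the OpenCV bindings (adjust for your environment):

```shell
pip install opencv-python numpy scipy
```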

how to

  1. Convert the .seq video files to .png frames by running $ python generate-images.py. They will end up in the images folder.
  2. Square images work better with YOLO, so convert the 640x480 frames to 640x640 frames by running $ python squarify-images.py.
  3. Convert the .vbb annotation files to .txt files by running $ python generate-annotation.py. This creates the labels folder, which contains one .txt file per frame (named like the frames), plus train.txt and test.txt listing the paths to the images.
  4. Adjust your YOLO .data file so it points to the generated train.txt and test.txt.
  5. Adjust your YOLO .cfg file: take e.g. yolo-voc.2.0.cfg and set height = 640, width = 640, classes = 2, and in the final layer filters = 35 (= (classes + 5) * 5).
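
Steps 2 and 3 hinge on two transformations: padding each frame to a square, and rescaling the pixel boxes into YOLO's normalized (class, x_center, y_center, width, height) format. Below is a minimal sketch, assuming boxes arrive as (x, y, w, h) in pixels on a 640x480 frame; the function names are illustrative, not the repo's actual code:

```python
import numpy as np

def pad_to_square(frame, size=640):
    """Pad a 640x480 frame to size x size by centering it vertically.

    Returns the padded frame and the vertical offset that was added,
    which the labels must be shifted by as well.
    """
    h, w = frame.shape[:2]
    top = (size - h) // 2
    padded = np.zeros((size, size) + frame.shape[2:], dtype=frame.dtype)
    padded[top:top + h, :w] = frame
    return padded, top

def to_yolo(box, y_offset, img_size=640, class_id=0):
    """Convert a pixel box (x, y, w, h) to YOLO's normalized tuple
    (class, x_center, y_center, width, height)."""
    x, y, w, h = box
    y += y_offset  # account for the vertical padding from pad_to_square
    xc = (x + w / 2) / img_size
    yc = (y + h / 2) / img_size
    return class_id, xc, yc, w / img_size, h / img_size

frame = np.zeros((480, 640, 3), dtype=np.uint8)
padded, offset = pad_to_square(frame)        # offset is 80 for 480 -> 640
label = to_yolo((100, 200, 50, 120), offset)
```

The key detail is that padding and label generation must agree on the same vertical offset, otherwise the boxes drift off the pedestrians in the squared frames.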

folder structure

|- caltech
|-- annotations
|-- test06
|--- V000.seq
|--- ...
|-- ...
|-- train00
|-- ...
|- caltech-for-yolo (this repo; run the scripts from here)
|-- generate-images.py
|-- generate-annotation.py
|-- images
|-- labels
|-- test.txt
|-- train.txt
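
With the layout above in place, steps 1-3 boil down to running the three scripts from inside the repo folder (this assumes the caltech data folder sits next to it, as shown):

```shell
cd caltech-for-yolo
python generate-images.py      # .seq -> .png frames in images/
python squarify-images.py      # 640x480 -> 640x640
python generate-annotation.py  # .vbb -> labels/*.txt, train.txt, test.txt
```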