👉 Convert object detection dataset format 👈
Dataset types
PASCAL VOC
: A PASCAL VOC dataset has one XML file for each image.
YOLO
: A YOLO dataset has one TXT file for each image.
COCO
: A COCO dataset has a single JSON file that holds the annotations for all images in a split.
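To make the per-image XML layout concrete, here is a minimal sketch that reads the bounding boxes out of a PASCAL VOC style annotation using only the standard library. The XML string is hand-written illustrative data, and `read_voc_boxes` is a hypothetical helper, not part of this package.

```python
import xml.etree.ElementTree as ET

# Hand-written annotation in the PASCAL VOC layout (illustrative data only).
VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return (image_width, image_height, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    # Each <object> element describes one labelled box in pixel coordinates.
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text, *coords))
    return width, height, boxes


width, height, boxes = read_voc_boxes(VOC_XML)
print(width, height, boxes)
```

A YOLO TXT file, by contrast, stores one line per box as `class_id x_center y_center width height`, with the four coordinates normalized to the image size.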
Currently supported formats
Currently, the following formats are supported:
From | To | Implemented |
---|---|---|
PASCAL VOC | YOLO (TXT files) | Yes |
YOLO | PASCAL VOC (XML files) | Yes |
Upcoming supported formats
From | To | Issue/PR (if any) |
---|---|---|
PASCAL VOC | COCO (JSON files) | No |
PASCAL VOC | TFRecord (TFRecord files) | No |
COCO | PASCAL VOC (XML files) | No |
COCO | YOLO (TXT files) | No |
COCO | TFRecord (TFRecord files) | No |
YOLO | COCO (JSON files) | No |
YOLO | TFRecord (TFRecord files) | No |
Installation
Installation from source code
git clone https://github.com/codePerfectPlus/dataset-convertor/
cd dataset-convertor
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Installation from PyPI
pip install dataset-convertor
Usage
Convert annotations from one format to another.
Example dataset layout:
- data/pascal_voc/JPEGImages/*.jpg
- data/pascal_voc/Annotations/*.xml
- data/yolo5/JPEGImages/*.jpg
- data/yolo5/labels/*.txt
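The layout above suggests that every image has an annotation file with the same base name. Before converting, it can help to check that pairing. The following is a minimal sketch under that assumption; `find_unpaired_images` is a hypothetical helper and is not part of this package.

```python
from pathlib import Path


def find_unpaired_images(images_dir, labels_dir, label_ext):
    """Return the sorted base names of .jpg images with no matching annotation file."""
    image_stems = {p.stem for p in Path(images_dir).glob("*.jpg")}
    label_stems = {p.stem for p in Path(labels_dir).glob(f"*{label_ext}")}
    return sorted(image_stems - label_stems)


if __name__ == "__main__":
    # Paths follow the example layout; adjust them to your own dataset.
    missing = find_unpaired_images(
        "data/pascal_voc/JPEGImages", "data/pascal_voc/Annotations", ".xml"
    )
    print("Images without annotations:", missing)
```

For the YOLO layout, the same call would use `data/yolo5/JPEGImages`, `data/yolo5/labels`, and `.txt`.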
From PASCAL VOC (XML) to YOLO (TXT)
from convert import Convertor
con = Convertor(input_folder='/home/user/data/pascal_voc', output_folder='/home/user/data/yolo5')
con.voc2yolo()
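For readers who want to see what a VOC to YOLO conversion involves, here is a sketch of the usual coordinate math: pixel corner coordinates become a normalized center point plus width and height. This shows the general technique, not necessarily how `voc2yolo()` is implemented; `voc_box_to_yolo` and the sample numbers are assumptions made for illustration.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners (VOC) to normalized center/size values (YOLO)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Illustrative box in a 640x480 image; class_id 0 is an arbitrary example label index.
class_id = 0
box = voc_box_to_yolo(100, 120, 300, 360, img_w=640, img_h=480)
line = f"{class_id} " + " ".join(f"{value:.6f}" for value in box)
print(line)  # 0 0.312500 0.500000 0.312500 0.500000
```

Each box becomes one such line in the image's TXT file. Mapping class names to integer ids needs a fixed, ordered class list, which a converter has to obtain from somewhere.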
From YOLO (TXT) to PASCAL VOC (XML)
from convert import Convertor
con = Convertor(input_folder='/home/user/data/yolo5', output_folder='/home/user/data/pascal_voc')
con.yolo2voc()
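Going from YOLO back to VOC means turning normalized center/size values into pixel corners. A YOLO TXT file does not record the image size, so the image dimensions have to come from the image itself. The following is a sketch of that general math, not necessarily what `yolo2voc()` does internally; `parse_yolo_line` and `yolo_box_to_voc` are hypothetical helpers.

```python
def parse_yolo_line(line):
    """Split one YOLO label line into (class_id, (x_center, y_center, width, height))."""
    parts = line.split()
    return int(parts[0]), tuple(float(value) for value in parts[1:5])


def yolo_box_to_voc(x_center, y_center, width, height, img_w, img_h):
    """Convert normalized center/size values (YOLO) to pixel corners (VOC)."""
    xmin = round((x_center - width / 2) * img_w)
    ymin = round((y_center - height / 2) * img_h)
    xmax = round((x_center + width / 2) * img_w)
    ymax = round((y_center + height / 2) * img_h)
    return xmin, ymin, xmax, ymax


# Illustrative label line for a 640x480 image.
class_id, box = parse_yolo_line("0 0.3125 0.5 0.3125 0.5")
print(class_id, yolo_box_to_voc(*box, img_w=640, img_h=480))  # 0 (100, 120, 300, 360)
```

Rounding to whole pixels means a round trip through both formats can shift a box edge by a pixel.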
Contributing
Create an issue or PR if a format is missing. Open-source contributions are welcome. Check the contributing guide for details.
Reference
- PASCAL VOC - http://host.robots.ox.ac.uk/pascal/VOC/
- COCO - http://cocodataset.org/
- YOLO9000 - https://arxiv.org/abs/1612.08242
- YOLOv4 - https://arxiv.org/abs/2004.10934v1