datum

Primary language: Python. License: MIT.

An automated data pipeline that uses the OpenImages dataset to create pre-trained detection models for transfer learning.

Core Components

Prereqs

  • Docker
  • NVIDIA GPU (if training)

Installation

The install script pulls and builds the FiftyOne and YOLOv5 Docker images.

utils/Install.sh

Dataset creation

  • Change line 10 of datum/main.py to list the classes you want
  • Edit the dataset.yaml file in the /data folder so that the train and val entries use absolute paths under /data
# Original
train: ./images/train/
val: ./images/val/
# Correct
train: /data/images/train/
val: /data/images/val/
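
The manual edit above could be scripted, which is also what the "Automate dataset.yaml formatting" to-do item points at. The following is only a minimal sketch, not part of datum: the function name `absolutize_paths` and the default root `/data` are assumptions made for illustration, and it uses a plain regular expression rather than a YAML parser, so it only handles top-level `train:` and `val:` lines that start with `./`.

```python
import re


def absolutize_paths(yaml_text: str, root: str = "/data") -> str:
    """Rewrite relative train/val paths (./...) to absolute paths under root.

    Hypothetical helper: not part of datum or YOLOv5.
    """
    base = root.rstrip("/")

    def fix(match: re.Match) -> str:
        key, path = match.group(1), match.group(2)
        return f"{key}: {base}/{path}"

    # Only touch top-level 'train:' and 'val:' lines whose value starts with './'.
    # Lines that are already absolute are left unchanged.
    return re.sub(r"^(train|val):\s*\./(\S*)", fix, yaml_text, flags=re.MULTILINE)


if __name__ == "__main__":
    original = "train: ./images/train/\nval: ./images/val/\n"
    print(absolutize_paths(original), end="")
```

Running the function twice is harmless, because paths that no longer start with `./` do not match the pattern.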

Training

Use the Ultralytics Docker image to train on the dataset.

  • Spin up the container and start training
utils/Train.sh
python train.py --img 640 --batch 16 --epochs 100 --data /data/dataset.yaml --weights yolov5s.pt
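
If you want to vary the image size, batch size, or epoch count without retyping the command, the argument list can be assembled programmatically. This is a rough sketch under stated assumptions: `build_train_command` is a hypothetical helper, not something datum or YOLOv5 provides, and its defaults simply mirror the command shown above.

```python
import shlex


def build_train_command(data="/data/dataset.yaml", weights="yolov5s.pt",
                        img=640, batch=16, epochs=100):
    """Return the training command as an argument list.

    Hypothetical helper; the flags are the ones used in the command above.
    """
    return [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,
    ]


if __name__ == "__main__":
    # Print a shell-ready command; pass the list to subprocess.run() instead
    # if you want to launch it from Python inside the container.
    print(shlex.join(build_train_command()))
```

Keeping the command as a list avoids shell-quoting problems if a path ever contains spaces.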

To-do

  • Make class selection more user-friendly
  • Use Docker volumes for dataset
  • Automate dataset.yaml formatting
  • Create a full Docker Compose stack