This repository contains code to pre-process datasets containing labelled satellite images of boats for use with a YOLOv5 model.
A list of key developers on the project.
Name | GitHub ID | Email
---|---|---
Harry Moss | @harryjmoss | h.moss@ucl.ac.uk
Sanaz Jabbari | @sanazjb |
Get your own copy up and running on Linux/macOS by following these simple steps:
- Clone the repository
git clone https://github.com/UCL-RITS/vetii
- Install dependencies (preferably within a virtual environment)
cd vetii/
python -m venv your_virtual_environment_name_here
source your_virtual_environment_name_here/bin/activate
pip install --upgrade pip
pip install -r requirements/base.txt
- Clone the YOLOv5 model repository (with VETII settings and modifications) within the VETII repository
git clone https://github.com/harryjmoss/yolov5.git
- Install model dependencies
cd yolov5
pip install -r requirements.txt
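As an optional sanity check, you can confirm that the core model dependencies resolved correctly. This sketch assumes the requirements files install PyTorch and OpenCV, as the YOLOv5 requirements normally do; adjust the imports if your environment differs:

```sh
# Optional check that the main model dependencies import correctly.
# Assumes torch and opencv-python were installed by the requirements files above.
python -c "import torch, cv2; print(torch.__version__, cv2.__version__)"
```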
The approach taken in the VETII project was as follows:
Data preprocessing:
- Pre-process all data, generate labels in the correct format, split datasets into train/validation sets and place into a suitable file structure.
Training:
- Use transfer learning to further train a pre-trained model on a large dataset of aerial images of boats. A subset of the MASATI-v2 dataset was chosen for training. After an initial literature search and a comparison of three models, YOLOv5x was chosen as the candidate model.
- Using the output weights from the previous step, further train the model on a curated dataset of satellite images of Sassoon Dock, Mumbai. Images were generated using Google Earth Pro and annotated with labelImg.
Each step is described in detail below.
The data provided within this project, and available externally, requires some preprocessing before it can be used to train YOLOv5 models. The steps described in this section are not necessary if you're using the model to detect boats within images!
In the following, $VETII refers to the directory containing this README.
For the MASATI-v2 dataset
cd into the MASATI-v2 data preprocessing directory:
cd $VETII/data_preprocessing/masati/
Request the dataset from the researchers via their Google form - you will receive a link to a Google Drive file with a URL like
https://drive.google.com/file/d/<GOOGLE-DRIVE-ID>/view
Extract <GOOGLE-DRIVE-ID> from the link above. Download and extract the dataset with
gdown https://drive.google.com/uc?id=<GOOGLE-DRIVE-ID>
unzip MASATI-v2.zip
rm MASATI-v2.zip
The dataset will be extracted to $VETII/data_preprocessing/masati/MASATI-v2.
All stages of data preprocessing can then be run with
dvc repro
By default, this uses a 75:25 train:validation dataset split. To change this, or any other parameter in the data preprocessing, see params.yaml.
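As an illustration of the sort of entry to look for - the key names below are made up and may not match the real file - a split setting in params.yaml might look something like:

```yaml
# Illustrative only: check params.yaml for the actual key names used in this repository
train_test_split:
  train_fraction: 0.75       # proportion of images used for training
  validation_fraction: 0.25  # proportion of images used for validation
```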
After data preprocessing, the current directory will look like
$VETII/data_preprocessing/masati/
├── MASATI-v2/ # original extracted folder
├── MASATI-v2.zip
├── PNGImages/ # all masati images
├── __init__.py
├── coco_json_annotations/ # masati labels in a json format
├── convert_masatixml_to_json_labels.py
├── convert_xml_to_yolotxt.py
├── dvc.lock
├── dvc.yaml
├── filename_mapping.csv
├── masati_instructions.md
├── modified_xml_annotations/ # corrected masati xml labels
├── params.yaml
├── refine_masati.py
├── remove_extra_classes.py
├── sort_yolo_images.py
├── train/ # training image set
├── train_test_split.py
├── validation/ # validation image set
├── xml_annotations/ # uncorrected masati xml labels
├── xml_annotations_test.txt
├── xml_annotations_train.txt
├── yolo_annotations/ # annotations in YOLO .txt format
├── yolo_images_train/ # renamed training set images for YOLOv5 compatibility
├── yolo_images_validation/ # renamed validation set images for YOLOv5 compatibility
├── yolo_labels_train/ # training set labels
└── yolo_labels_validation/ # validation set labels
and the $VETII/data/masati directory will contain files in the necessary structure for use with YOLOv5:
$VETII/data/masati
├── images
│ ├── train
│ └── val
└── labels
├── train
└── val
For the Sassoon Dock dataset
cd into the Sassoon Dock data preprocessing directory:
cd $VETII/data_preprocessing/sassoon/
Download the dataset - currently available through the VETII sharepoint site and UCL RDS - and place it in the data preprocessing directory. For access to the dataset on UCL RDS, contact one of the repository owners.
Extract the dataset with
tar -xzvf sassoon_dock-v2.tar.gz
Run the data preprocessing with
dvc repro
Files will be placed within $VETII/data/sassoon, which should look like
$VETII/data/sassoon
├── images
│ ├── train
│ └── val
└── labels
├── train
└── val
After cloning the VETII YOLOv5 fork inside the $VETII directory and installing the required dependencies, model training is performed using the following commands.
Note: the runs/train/expN directory name changes with each training run, where N is the number of the run.
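For example, the first run writes to runs/train/exp, the second to runs/train/exp2, and so on. A quick way to find the directory used by the most recent run is:

```sh
# List the training run directories, newest first, and show the most recent one
ls -td runs/train/exp*/ | head -n 1
```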
- Training the model with the MASATI-v2 dataset
Use the following commands to recreate the training used during the VETII project (two training runs of 50 epochs each), or train for 100 epochs in a single run. Results should be similar.
cd $VETII/yolov5
python train.py --img 512 --batch 8 --epochs 50 --data masati.yaml --weights yolov5x.pt --workers 0
cp runs/train/exp/weights/best.pt weights/masati_yolov5x_50epoch_training.pt
python train.py --img 512 --batch 16 --epochs 50 --data masati.yaml --weights weights/masati_yolov5x_50epoch_training.pt --workers 0
cp runs/train/exp2/weights/best.pt weights/masati_yolov5x_100epoch_best.pt
Training finished in under 8 hours using a V100 GPU.
- Training the model with the Sassoon Dock dataset
Recover the model weights used for inference in the VETII project with the following commands:
cd $VETII/yolov5
python train.py --img 1175 --rect --batch 2 --epochs 100 --data sassoon.yaml --weights weights/masati_yolov5x_100epoch_best.pt --workers 0
cp runs/train/exp3/weights/best.pt weights/sassoon100epoch_best_trained_on_masati_yolov5x_100epoch.pt
python train.py --img 1175 --rect --batch 8 --epochs 50 --hyp data/hyp.sassoon.yaml --data sassoon.yaml --weights weights/sassoon100epoch_best_trained_on_masati_yolov5x_100epoch.pt --workers 0
cp runs/train/exp4/weights/best.pt weights/sassoon_dock_final_weights.pt
You should then have a set of weights equivalent to the ones found on the VETII SharePoint site.
At this point you will have either produced a set of trained weights by following the previous steps, or will have cloned this repository and obtained the pre-trained weights provided. If you want to train the model on an additional dataset at this point (more images you've obtained for a different dock, maybe...) then it's vital that you pre-process the data correctly. The automated pipeline set out in the data preprocessing section contains the steps required to properly format the Sassoon Dock and MASATI-v2 datasets for the YOLOv5 model, but will need to be adapted for additional datasets. The steps to follow are:
- Gather satellite images using Google Earth (or similar). Copyright watermarks are included by default, but it's also useful to include a distance scale and compass orientation marker to standardise the images in your dataset. For the Sassoon dock dataset, images are orientated with the North compass point facing vertically upwards on the image and all images are taken at the same altitude. You may want to experiment with taking images at different altitudes so that the model is able to learn extra features in the images.
- Label the images with labelImg (open source!) set to XML format. The software allows output directly to YOLO txt format, but saving as XML retains image metadata that is used later (see the example annotation after this list).
- The entire data preprocessing process is contained in several scripts within data_preprocessing/scripts. These scripts are used for the Sassoon Dock and MASATI-v2 datasets by the DVC pipeline, so they shouldn't be moved around! For a dataset of your own similar to the Sassoon Dock dataset, data preprocessing consists of:
  - Convert XML labels to YOLO (.txt) style labels
  - Split the dataset into a training and test set, usually 80:20
  - Rename image and annotation files as sequential integers (expected by the YOLOv5 repository)
  - Create the correct directory structure expected by the YOLOv5 repository.
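For reference, labelImg's XML output follows the Pascal VOC annotation layout. The sketch below shows roughly what an annotation for a single boat looks like; the filename, image size and box coordinates are illustrative values only, and labelImg writes a few additional fields not shown here:

```xml
<!-- Illustrative labelImg (Pascal VOC style) annotation - values are made up -->
<annotation>
  <folder>images</folder>
  <filename>00001.jpg</filename>
  <size>
    <width>1175</width>
    <height>664</height>
    <depth>3</depth>
  </size>
  <object>
    <name>boat</name>
    <bndbox>
      <xmin>412</xmin>
      <ymin>233</ymin>
      <xmax>468</xmax>
      <ymax>287</ymax>
    </bndbox>
  </object>
</annotation>
```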
All four stages are handled with Python scripts located within data_preprocessing/scripts. For an example dataset, do the following:
- Create a directory within data_preprocessing/ for your dataset, e.g. data_preprocessing/your_dataset. Move images into an images directory and XML image annotation files into an xml_annotations directory within data_preprocessing/your_dataset.
- Copy the params.yaml file from the data_preprocessing/sassoon directory. This assumes you're working on a dataset similar to the Sassoon Dock dataset, that you want a 75:25 train:test split and that your images are .jpg files. Change the dataset name to something sensible - by default it's set to sassoon.
- Run python ../scripts/convert_xml_to_yolotxt.py to create a directory yolo_annotations that contains YOLO (.txt) format image labels.
- Run python ../scripts/train_test_split.py to split the dataset into training and test sets, with a default train:test split of 75:25 defined in params.yaml.
- Run python ../scripts/sort_yolo_images.py to rename image and annotation files and arrange them into directories.
- Run python ../scripts/prepare_yolov5.py - this places the dataset in the data/your_dataset directory (if you chose that as the name in the params.yaml file!)
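Putting those steps together, a complete preprocessing run for a hypothetical dataset called your_dataset would look roughly like this (the script names and relative paths are taken from the steps above; adjust them if your layout differs):

```sh
# Sketch of a full preprocessing run for a hypothetical "your_dataset" directory
cd $VETII/data_preprocessing/your_dataset
python ../scripts/convert_xml_to_yolotxt.py  # XML labels -> YOLO .txt labels in yolo_annotations/
python ../scripts/train_test_split.py        # split into train/test sets using params.yaml
python ../scripts/sort_yolo_images.py        # rename files and arrange them into directories
python ../scripts/prepare_yolov5.py          # place the dataset in data/your_dataset
```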
At this stage your dataset is set up in the same way that the above DVC pipelines will set up the Sassoon Dock and MASATI-v2 datasets, and should be ready for use with YOLOv5!
... or, detecting boats with the trained model
This step requires either the model weights provided on the VETII SharePoint or weights you have produced yourself by following the steps in the training section. These weights should be placed in the $VETII/yolov5/weights directory.
To detect boats within images (in jpg or png format), place them within the $VETII/yolov5/data/images directory. For example, to detect boats in images of the Sassoon Dock, use the following command for images in the $VETII/yolov5/data/images/sassoon/ directory
$ python detect.py --weights weights/sassoon_wandb_best_150epoch_trained_on_masati100epoch_yolov5x.pt --img 1175 --conf 0.25 --source data/images/sassoon --iou-thres 0.5 --save-txt --save-conf
Arguments used:
- img: the image width in pixels
- conf: confidence threshold (~prediction probability threshold)
- source: location of an image file or image directory
- iou-thres: Intersection over Union (IoU, or Jaccard index) threshold applied during detection - this is the minimum IoU score
- save-txt: if present, saves the centre coordinates, width and height of the bounding boxes around detected boats. For VETII this should always be included!
- save-conf: saves confidence scores in the final column of the text output. Include this for VETII!
Outputs can be found in the runs/detect directory. An example output directory for a single image detection should look something like
├── 00001.jpg
├── centre_dot_00001.jpg
└── labels
└── 00001.txt
The output text file format for a single input image will look like
Object_ID Class x y width height confidence
For the pre-trained MASATI+Sassoon model, Class will always equal zero as the model was trained on a single class of object, boat. The Object_ID is unique to each detected object in the image, and is also written onto the corresponding output centre_dot_ image.
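For example, a single detected boat might appear in the text file as the line below; the numbers are illustrative only:

```
1 0 0.512 0.431 0.043 0.027 0.91
```

reading as Object_ID 1, Class 0 (boat), followed by the bounding box centre coordinates, width, height and the detection confidence.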
A great place to get started is to first take a look at any open issues.
If you spot something else and would like to work on it, please feel free to create an issue.
Once you've found something to work on, the suggested workflow is:
- Fork this repository
- Create your new feature in a branch (git checkout -b feature/MyNewGreatFeature)
- Commit your changes with a descriptive and helpful commit message (git commit -m 'Adding MyNewGreatFeature')
- Push your changes to your forked remote (git push my_vetii_fork feature/MyNewGreatFeature)
- Open a pull request to merge your changes into the main branch.
Python code should be PEP8 compliant where possible, using black to make life easier.
Before being merged into main, all code should have well-written documentation, including the use of docstrings. Adding to and updating existing documentation is highly encouraged.
Optional, but recommended - gitmoji provides an emoji-to-commit-message dictionary. It also offers an optional gitmoji-cli that can be installed as a commit hook so you remember to use it when writing commits!
The branch naming convention is iss_<issue-number>_<short_description>, where <issue-number> is the issue number and <short_description> is a short description of the issue.
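For example, a branch addressing a hypothetical issue 42 about a broken label path might be named iss_42_fix_label_path.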
Contributing to existing tests or adding new ones will always be well received! Please include tests in any contributed code before requesting to merge, if appropriate.