/insect-detect-ml

Notebooks for detection and classification model training, an insect classification model, and Python scripts for processing data collected with the Insect Detect DIY camera trap.


Insect Detect ML - Model training and data processing


This repository contains Jupyter notebooks that can be used to train custom YOLOv5, YOLOv6, YOLOv7 and YOLOv8 object detection models or a custom YOLOv5 image classification model. All notebooks can be run in Google Colab, where you will have access to a free cloud GPU for fast training without special hardware requirements.

The Python script for classification of the captured insect images is available in the custom yolov5 fork and can be used together with the provided insect classification model.

Use the process_metadata.py script for post-processing of metadata .csv files with classification results.


Model training

You can find more information about detection model training at the Insect Detect Docs 📑.

  • YOLOv5 detection model training
  • YOLOv6 detection model training
  • YOLOv7 detection model training
  • YOLOv8 detection model training

    The PyTorch model weights can be converted to .blob format at tools.luxonis.com for on-device inference with the Luxonis OAK devices.

 

You can find more information about classification model training at the Insect Detect Docs 📑.

  • YOLOv5 classification model training

    The notebook for classification model training includes export to ONNX format for faster CPU inference.


Classification

The modified classification script in the custom yolov5 fork includes the following added options:

  • --sort-top1: sort the classified images into folders named after the predicted top1 class
  • --sort-prob: sort images first by probability and then by top1 class (requires --sort-top1)
  • --concat-csv: concatenate all metadata .csv files and append classification results to new columns
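
The sorting behaviour described for --sort-top1 and --sort-prob can be pictured with a minimal sketch. This is not the code from the yolov5 fork: the function name `sort_by_top1`, the `(image_path, top1_class, top1_prob)` tuple structure and the 0.1-wide probability bins are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def sort_by_top1(predictions, out_dir, sort_prob=False):
    """Copy classified images into folders named after the predicted top1 class.

    predictions: iterable of (image_path, top1_class, top1_prob) tuples
                 (hypothetical structure, not the script's internal format).
    sort_prob:   if True, group first by a probability bin, then by class.
    """
    out_dir = Path(out_dir)
    copied = []
    for image_path, top1_class, top1_prob in predictions:
        target_dir = out_dir
        if sort_prob:
            # Assumed binning for illustration: "prob_0.9-1.0", "prob_0.8-0.9", ...
            lower = min(int(top1_prob * 10), 9) / 10
            target_dir = target_dir / f"prob_{lower:.1f}-{lower + 0.1:.1f}"
        target_dir = target_dir / top1_class
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 keeps file metadata and returns the path of the new file
        copied.append(Path(shutil.copy2(image_path, target_dir)))
    return copied
```

Copying instead of moving keeps the original camera trap output untouched, which makes it easy to rerun the sorting with different settings.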

More information about deployment of the classification script can be found at the Insect Detect Docs 📑.

 

Classification model

| Model (.onnx) | Size (pixels) | Top1 accuracy (test) | Precision (test) | Recall (test) | F1 score (test) |
| --- | --- | --- | --- | --- | --- |
| EfficientNet-B0 | 128 | 0.972 | 0.971 | 0.967 | 0.969 |

Table Notes

  • The model was trained for 20 epochs with image size 128, batch size 64 and default settings and hyperparameters. Reproduce the model training with the provided Google Colab notebook.
  • Trained on Insect Detect - insect classification dataset v2 with 27 classes. To reproduce the dataset split, keep the default settings in the Colab notebook (train/val/test ratio = 0.7/0.2/0.1, random seed = 1).
  • The dataset can be explored at Roboflow Universe. Export from Roboflow compresses the images and can decrease model accuracy. It is recommended to use the uncompressed dataset from Zenodo.
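
To illustrate why a fixed ratio and random seed make a split repeatable, here is a minimal sketch using only the Python standard library. The function `split_dataset` is hypothetical and will not reproduce the exact split from the Colab notebook, which depends on the notebook's own tooling; it only shows the principle.

```python
import random


def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=1):
    """Shuffle items reproducibly and split them into train/val/test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    # Sort first so the result does not depend on the input order
    shuffled = sorted(items)
    # A dedicated Random instance avoids touching the global random state
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test split
    return train, val, test
```

Applying such a split separately to each class folder keeps the class proportions similar across the three splits.
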
Full model metrics on dataset test split

| Class | Images | Top1 accuracy (test) | Precision (test) | Recall (test) | F1 score (test) |
| --- | --- | --- | --- | --- | --- |
| all | 2125 | 0.972 | 0.971 | 0.967 | 0.969 |
| ant | 111 | 1.0 | 0.991 | 1.0 | 0.996 |
| bee | 107 | 0.963 | 0.972 | 0.963 | 0.967 |
| bee_apis | 31 | 1.0 | 0.969 | 1.0 | 0.984 |
| bee_bombus | 127 | 1.0 | 0.992 | 1.0 | 0.996 |
| beetle | 52 | 0.885 | 0.92 | 0.885 | 0.902 |
| beetle_cocci | 78 | 0.987 | 1.0 | 0.987 | 0.994 |
| beetle_oedem | 21 | 0.905 | 0.905 | 0.905 | 0.905 |
| bug | 39 | 0.846 | 1.0 | 0.846 | 0.917 |
| bug_grapho | 19 | 1.0 | 1.0 | 1.0 | 1.0 |
| fly | 173 | 0.971 | 0.944 | 0.971 | 0.957 |
| fly_empi | 19 | 1.0 | 1.0 | 1.0 | 1.0 |
| fly_sarco | 33 | 0.909 | 0.938 | 0.909 | 0.923 |
| fly_small | 167 | 0.958 | 0.952 | 0.958 | 0.955 |
| hfly_episyr | 253 | 0.996 | 0.996 | 0.996 | 0.996 |
| hfly_eristal | 197 | 0.99 | 0.995 | 0.99 | 0.992 |
| hfly_eupeo | 137 | 0.985 | 0.993 | 0.985 | 0.989 |
| hfly_myathr | 60 | 1.0 | 1.0 | 1.0 | 1.0 |
| hfly_sphaero | 39 | 0.974 | 1.0 | 0.974 | 0.987 |
| hfly_syrphus | 50 | 0.98 | 1.0 | 0.98 | 0.99 |
| lepi | 24 | 1.0 | 0.96 | 1.0 | 0.98 |
| none_bg | 86 | 0.988 | 0.966 | 0.988 | 0.977 |
| none_bird | 8 | 1.0 | 1.0 | 1.0 | 1.0 |
| none_dirt | 85 | 0.976 | 0.902 | 0.976 | 0.938 |
| none_shadow | 66 | 0.924 | 0.953 | 0.924 | 0.938 |
| other | 79 | 0.861 | 0.883 | 0.861 | 0.872 |
| scorpionfly | 12 | 1.0 | 1.0 | 1.0 | 1.0 |
| wasp | 52 | 1.0 | 1.0 | 1.0 | 1.0 |
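
As a reading aid for the metrics above: the F1 score is the harmonic mean of precision and recall, and in the per-class rows the Top1 accuracy equals the recall, since both are the share of a class's images that were predicted correctly. The short sketch below recomputes F1 for three rows of the test-split table. The helper `f1_score` is written here for illustration and is not taken from the repository.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the test-split table
reported = {
    "beetle": (0.92, 0.885, 0.902),
    "bug": (1.0, 0.846, 0.917),
    "other": (0.883, 0.861, 0.872),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    # Table values are rounded to three decimals, so allow a small tolerance
    print(f"{name}: computed {f1:.3f}, reported {f1_reported}")
```

Small deviations in the last digit are possible because the table lists rounded precision and recall values.
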
Full model metrics on dataset validation split

| Class | Images | Top1 accuracy (val) | Precision (val) | Recall (val) | F1 score (val) |
| --- | --- | --- | --- | --- | --- |
| all | 4189 | 0.98 | 0.979 | 0.974 | 0.976 |
| ant | 219 | 0.995 | 0.995 | 0.995 | 0.995 |
| bee | 212 | 0.967 | 0.958 | 0.967 | 0.962 |
| bee_apis | 58 | 1.0 | 0.967 | 1.0 | 0.983 |
| bee_bombus | 252 | 1.0 | 0.996 | 1.0 | 0.998 |
| beetle | 104 | 0.933 | 0.942 | 0.933 | 0.937 |
| beetle_cocci | 155 | 1.0 | 1.0 | 1.0 | 1.0 |
| beetle_oedem | 39 | 0.897 | 0.972 | 0.897 | 0.933 |
| bug | 78 | 0.949 | 0.961 | 0.949 | 0.955 |
| bug_grapho | 37 | 1.0 | 1.0 | 1.0 | 1.0 |
| fly | 343 | 0.983 | 0.939 | 0.983 | 0.96 |
| fly_empi | 35 | 1.0 | 0.972 | 1.0 | 0.986 |
| fly_sarco | 63 | 0.841 | 0.964 | 0.841 | 0.898 |
| fly_small | 332 | 0.97 | 0.982 | 0.97 | 0.976 |
| hfly_episyr | 503 | 0.996 | 0.996 | 0.996 | 0.996 |
| hfly_eristal | 390 | 1.0 | 1.0 | 1.0 | 1.0 |
| hfly_eupeo | 271 | 0.989 | 0.993 | 0.989 | 0.991 |
| hfly_myathr | 118 | 0.992 | 1.0 | 0.992 | 0.996 |
| hfly_sphaero | 74 | 1.0 | 0.987 | 1.0 | 0.993 |
| hfly_syrphus | 97 | 1.0 | 0.99 | 1.0 | 0.995 |
| lepi | 45 | 0.978 | 0.978 | 0.978 | 0.978 |
| none_bg | 170 | 0.988 | 0.982 | 0.988 | 0.985 |
| none_bird | 13 | 1.0 | 1.0 | 1.0 | 1.0 |
| none_dirt | 167 | 0.982 | 0.976 | 0.982 | 0.979 |
| none_shadow | 129 | 0.969 | 0.984 | 0.969 | 0.977 |
| other | 158 | 0.88 | 0.903 | 0.88 | 0.891 |
| scorpionfly | 24 | 1.0 | 1.0 | 1.0 | 1.0 |
| wasp | 103 | 0.99 | 1.0 | 0.99 | 0.995 |


Metadata post-processing

Install the required packages by running:

python.exe -m pip install -r requirements.txt

Or use the Python Launcher for Windows with:

py -m pip install -r requirements.txt

The process_metadata.py script can be used to automatically post-process the concatenated metadata .csv file after the classification step, as it will still contain multiple rows for each tracked insect.

The output of the script includes a *top1_final.csv file in which each row corresponds to an individual tracked insect and its classification result with the highest weighted probability. Additionally, several plots are generated that can give a first overview of the processed metadata.
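
The reduction from several rows per tracked insect to one row can be sketched as follows. This is not the code of process_metadata.py: the column names `track_ID`, `top1` and `top1_prob`, the function name `top1_final`, and the weighting (mean probability of a class multiplied by its share of the track's images) are assumptions chosen to illustrate the idea of a weighted probability.

```python
from collections import defaultdict


def top1_final(rows):
    """Reduce per-image rows to one row per tracked insect.

    rows: iterable of dicts with the assumed keys "track_ID", "top1", "top1_prob".
    """
    # track ID -> predicted class -> list of probabilities
    by_track = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_track[row["track_ID"]][row["top1"]].append(float(row["top1_prob"]))

    result = []
    for track_id, classes in by_track.items():
        n_images = sum(len(probs) for probs in classes.values())
        # Assumed weighting: mean class probability times the class's share of images
        weighted = {
            cls: (sum(probs) / len(probs)) * (len(probs) / n_images)
            for cls, probs in classes.items()
        }
        best = max(weighted, key=weighted.get)
        result.append({
            "track_ID": track_id,
            "top1": best,
            "top1_prob_weighted": round(weighted[best], 3),
            "images": n_images,
        })
    return result
```

Because the function only expects dictionaries, rows read with `csv.DictReader` from the concatenated metadata file could be passed in directly, provided the column names match. With this weighting, a class that is predicted for most images of a track wins over a single high-confidence outlier.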

More information about deployment of the post-processing script can be found at the Insect Detect Docs 📑.


Image processing

The process_images.py script can be used to calculate different metrics of the captured images (e.g. mean/median/min/max width/height) and to remove corrupted .jpg images from the data folder (camera trap output). Corrupted images are occasionally generated by the automated monitoring script (e.g. during a power outage) and cause an error when running the classification script.
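
One simple way to spot incompletely written .jpg files is to check the JPEG start and end markers. The sketch below is a standard-library heuristic, not the method used by process_images.py; the function names are hypothetical, and a full decode with an image library would catch more kinds of corruption.

```python
from pathlib import Path

JPEG_START = b"\xff\xd8"  # start-of-image marker
JPEG_END = b"\xff\xd9"    # end-of-image marker


def looks_corrupted(path):
    """Return True if the file lacks the JPEG start or end marker."""
    data = Path(path).read_bytes()
    return not (data.startswith(JPEG_START) and data.endswith(JPEG_END))


def find_corrupted_jpgs(data_dir):
    """List all .jpg files below data_dir that fail the marker check."""
    return sorted(p for p in Path(data_dir).rglob("*.jpg") if looks_corrupted(p))
```

The function only reports suspicious files, so the list can be reviewed before anything is deleted.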


License

This repository is licensed under the terms of the GNU Affero General Public License v3.0 (GNU AGPLv3).

Citation

If you use resources from this repository, please cite our paper:

Sittinger M, Uhler J, Pink M, Herz A (2024) Insect detect: An open-source DIY camera trap for automated insect monitoring. PLoS ONE 19(4): e0295474. https://doi.org/10.1371/journal.pone.0295474