Colruyt products detection

This code trains a Faster R-CNN architecture with a ResNet-50 backbone to detect and categorize a set of 60 types of Colruyt products (plus the background class). The code is written in PyTorch.

How to use

Install environment

To create the conda environment in which I developed the solution:

conda env create -f environment.yml

Then activate it:

conda activate gym_env
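
Once the environment is active, a quick check that the main dependencies can be found may save a failed run later. The sketch below assumes the environment provides torch and torchvision; that package list is a guess based on the rest of this README, not read from environment.yml, and the function name is hypothetical.

```python
import importlib.util

# Assumed package list (not read from environment.yml); adjust to your setup.
REQUIRED_PACKAGES = ("torch", "torchvision")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level package names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```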

Prepare the data

This repository contains no images, since they are too heavy to upload. However, you can build the expected folder structure by following these steps:

  1. Under the data folder, create an images folder and populate it with the images of the exercise; there should be exactly 31,000 images.
  2. Under the data folder, launch the script with python3; it moves the images into 4 different folders:
    • train/images
    • test/images
    • val/images
    • unlabeled/images
  3. That's it, you are ready.
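
For illustration, the folder layout produced by step 2 can be sketched as follows. This is not the repository's script: how it decides which image belongs to which split is not described here, so the sketch takes that mapping from the caller. The names create_split_dirs, move_images and assignment are hypothetical.

```python
import shutil
from pathlib import Path

SPLITS = ("train", "test", "val", "unlabeled")


def create_split_dirs(data_dir):
    """Create <data_dir>/<split>/images for every split and return the paths."""
    dirs = {}
    for split in SPLITS:
        path = Path(data_dir) / split / "images"
        path.mkdir(parents=True, exist_ok=True)
        dirs[split] = path
    return dirs


def move_images(data_dir, assignment):
    """Move files from <data_dir>/images into their split folder.

    `assignment` maps an image file name to one of SPLITS (hypothetical
    interface; the real script derives the split in its own way).
    Returns the number of files actually moved.
    """
    dirs = create_split_dirs(data_dir)
    source = Path(data_dir) / "images"
    moved = 0
    for name, split in assignment.items():
        src = source / name
        if src.is_file():  # silently skip names that are not present
            shutil.move(str(src), str(dirs[split] / name))
            moved += 1
    return moved
```

Comparing the returned count with the number of entries in the mapping is a cheap way to spot missing images.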

Prepare the model

Create a model folder in the root of this repo and download the following model:

Generate testing results

The results should already be available under /data/test/result.csv in the format requested by the Colruyt team. To regenerate the expected CSV results file, launch the script after configuring the following parameter:

  • model_path: path to your trained model; if it does not exist, a pretrained COCO model is loaded (with poor results)
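
The fallback described for model_path could look roughly like the sketch below. It is an assumption, not the repository's code: it uses torchvision's fasterrcnn_resnet50_fpn (the FPN variant is a guess), the weights="DEFAULT" argument of torchvision 0.13 or later, and it assumes the checkpoint file stores a state_dict. The names weights_source and load_model are hypothetical.

```python
import os

NUM_CLASSES = 61  # 60 product types + background


def weights_source(model_path):
    """Return "checkpoint" if model_path is an existing file, else "coco"."""
    return "checkpoint" if model_path and os.path.isfile(model_path) else "coco"


def load_model(model_path, device="cpu"):
    """Build the detector and load trained weights when they are available."""
    # Imported lazily so weights_source() works without PyTorch installed.
    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    # COCO-pretrained Faster R-CNN with a ResNet-50 FPN backbone (assumed variant).
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box head so it predicts NUM_CLASSES classes instead of COCO's.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

    if weights_source(model_path) == "checkpoint":
        # Assumes the file was written with torch.save(model.state_dict(), path).
        model.load_state_dict(torch.load(model_path, map_location=device))
    return model.to(device).eval()
```

Without a trained checkpoint, the replaced box head is randomly initialized, which is consistent with the poor results mentioned above.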


To train the model, launch the script after configuring the following parameters:

  • BATCH_SIZE: depends on the available GPU memory (default: 4)
  • EPOCHS: number of epochs to train for (default: 200)
  • checkpoint_model: training starts from the weights of this model (if it does not exist, a pretrained COCO model is loaded)
  • new_model: path where the new model is stored
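
To show how these parameters typically fit together, here is a minimal training-loop sketch for a torchvision detection model. It is not the repository's training script: the optimizer and its hyperparameters are assumptions, and the function names are hypothetical. The data loader is expected to yield (images, targets) batches, for example by using collate_fn=lambda batch: tuple(zip(*batch)).

```python
BATCH_SIZE = 4  # default stated above; pass it as batch_size when building the data loader
EPOCHS = 200    # default stated above


def total_loss(loss_dict):
    """Sum the individual loss terms returned by the detector in training mode."""
    return sum(loss_dict.values())


def train(model, data_loader, new_model, epochs=EPOCHS, device="cpu"):
    """Train `model` and save its weights to the path `new_model` after each epoch."""
    import torch  # imported lazily so total_loss() is usable without PyTorch

    params = [p for p in model.parameters() if p.requires_grad]
    # Optimizer settings are illustrative assumptions, not taken from the repo.
    optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
    model.to(device)

    for epoch in range(epochs):
        model.train()
        for images, targets in data_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            # In training mode, torchvision detectors return a dict of losses.
            loss = total_loss(model(images, targets))

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        torch.save(model.state_dict(), new_model)
```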

NOTE: The training set has been extended by manually labeling a set of 300 extra images that were unlabeled and that did not belong to the test set. This has been done using the following repository:

Evaluation on validation set

A custom validation set of 100 difficult images has been labeled manually using the COCO-annotator repo. This validation set does not contain images from the test set. You can assess your model by following these steps:

  • Under src, launch git clone
  • And install pycocotools with pip install git+
  • In the script, set the model_name variable to your model
  • Launch python3
  • A result is generated in the folder data/val/results
  • Go to src/cocoapi/PythonAPI/demos and move the evaluation.ipynb notebook to this directory
  • Configure the variables annDir, annFile and resFile according to your machine and your model name
  • Launch the evaluation.ipynb notebook
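
As a rough guide to what the evaluation step involves, the sketch below converts detections to the COCO results format and runs the standard pycocotools bounding-box evaluation. It is not the content of evaluation.ipynb. The function names are hypothetical, and it assumes that the model's labels match the category ids in the annotation file and that boxes come out of the model as [x1, y1, x2, y2].

```python
import json


def to_coco_results(image_id, boxes_xyxy, labels, scores):
    """Convert one image's detections to COCO result entries ([x, y, w, h] boxes)."""
    results = []
    for (x1, y1, x2, y2), label, score in zip(boxes_xyxy, labels, scores):
        results.append({
            "image_id": int(image_id),
            "category_id": int(label),  # assumes label == COCO category id
            "bbox": [float(x1), float(y1), float(x2 - x1), float(y2 - y1)],
            "score": float(score),
        })
    return results


def save_results(results, res_file):
    """Write the result entries to a JSON file readable by COCO.loadRes."""
    with open(res_file, "w") as f:
        json.dump(results, f)


def evaluate(ann_file, res_file):
    """Print the COCO bounding-box metrics and return the summary statistics."""
    # Imported lazily so the helpers above work without pycocotools installed.
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    coco_gt = COCO(ann_file)             # ground-truth annotations
    coco_dt = coco_gt.loadRes(res_file)  # detections in COCO results format
    coco_eval = COCOeval(coco_gt, coco_dt, "bbox")
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()
    return coco_eval.stats
```

Here ann_file and res_file play the role of the annFile and resFile variables mentioned above.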


To visualize the model's predictions on the validation set, launch the script as follows:

  • Configure model_path: path to your trained model; if it does not exist, a pretrained COCO model is loaded (with poor results)
  • Launch python3
  • Press q to go to the next image

Expected folder structure

├── data
│   ├── images
│   ├── info.docx
│   ├──
│   ├── test
│   │   ├── images
│   │   ├── result.csv
│   │   └── test.json
│   ├── train
│   │   ├── images
│   │   ├── train_info.json
│   │   ├── train.json
│   │   └── train_results.json
│   ├── unlabeled
│   │   ├──
│   │   ├── images
│   │   ├── train_info_ext.json
│   │   └── val_info_ext.json
│   └── val
│       ├── images
│       ├── results
│       ├── val_info.json
│       └── val.json
├── environment.yml
├── model
│   ├──
├── runs
└── src
    ├── cocoapi