Logo Detection

A logo detection system using YOLOv5.

Intro

Brands want to understand who uses their products and how. One way to do so is to use Computer Vision algorithms to detect relevant pictures on social media. Object Detection is the method of detecting objects in images or videos. Our goal here was to train a model able to detect brand logos.

Detection Results

Environment & Requirements
Description
Usage Tips
1. Data preparation
2. Inference and Detection
The contributors
License

1. Environment

We developed our code in Google Colab and then trained the largest models on an Azure Virtual Machine.

To use the package, follow the guidelines below:

Install Python 3.8
Install the latest version of PyTorch with CUDA 11.3 support from the official website
Install the required packages:

    $ git clone https://github.com/Kasrazn97/Logo_Detection
    $ cd Logo_Detection
    $ pip install -r requirments.txt

Alternative: You can also recreate the conda environment used for this project with the following steps:

Clone the repository
cd Logo_Detection
conda env create -f environment.yml
conda activate yolonew
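
After either installation route, it can help to confirm that the interpreter and PyTorch build match the requirements listed above. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it only reports what it finds rather than enforcing any version.

```python
import importlib.util
import sys


def check_environment():
    """Collect basic facts about the active Python environment."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_cuda_build": None,
        "cuda_available": False,
    }
    if info["torch_installed"]:
        # Imported lazily so the script also runs when PyTorch is missing.
        import torch

        # CUDA version the wheel was built against (None for CPU-only builds).
        info["torch_cuda_build"] = torch.version.cuda
        # True only if a usable GPU and driver are actually present.
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If the reported Python version is not 3.8 or the CUDA build is not 11.3, the environment differs from the one described here, which may or may not matter for your setup.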

2. Description

We used two model architectures from the YOLOv5 family: YOLOv5s and YOLOv5l. By experimenting with dataset preprocessing steps and fine-tuning procedures, we produced 5 different models and compared their performance.

We chose YOLOv5 because it is a state-of-the-art model and is considered one of the best in terms of the trade-off between speed and accuracy. Here are the details of the two models we chose to train (you can find the details of all the YOLOv5 models here):

Model	Size (pixels)	mAP val 0.5:0.95	mAP val 0.5	Speed CPU b1 (ms)	Speed V100 b1 (ms)	Speed V100 b32 (ms)	Params (M)	FLOPs @640 (B)
YOLOv5s	640	37.2	56.0	98	6.4	0.9	7.2	16.5
YOLOv5l	640	48.8	67.2	430	10.1	2.7	46.5	109.1

1. Dataset

The raw dataset we used consists of images showing the following logos: Nike, Adidas, Under Armour, Puma, The North Face, Starbucks, Apple Inc., Mercedes-Benz, NFL, Coca-Cola, Chanel, Toyota, Pepsi, Hard Rock Cafè. We used Roboflow to convert the dataset to COCO format and to apply preprocessing steps, which include image resizing and data augmentation. In particular, we applied the following augmentations: rotation, blur, flip, shear, exposure, mosaic, and crop (at both the image and bounding-box levels).
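
Because the Usage Tips section expects YOLO-formatted labels, it may help to see how a COCO-style box relates to a YOLO-style one. The sketch below assumes the usual conventions (COCO: pixel values [x_min, y_min, width, height]; YOLO: centre-based values normalised by the image size). The function names are hypothetical, and Roboflow performs this kind of conversion for you when exporting.

```python
def coco_to_yolo(box, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to a
    YOLO box (x_center, y_center, width, height) normalised to [0, 1]."""
    x_min, y_min, w, h = box
    return (
        (x_min + w / 2) / img_w,  # centre x, as a fraction of image width
        (y_min + h / 2) / img_h,  # centre y, as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_to_coco(box, img_w, img_h):
    """Inverse of coco_to_yolo: back to pixel [x_min, y_min, width, height]."""
    x_c, y_c, w_n, h_n = box
    w, h = w_n * img_w, h_n * img_h
    return [x_c * img_w - w / 2, y_c * img_h - h / 2, w, h]


if __name__ == "__main__":
    yolo_box = coco_to_yolo([100, 50, 200, 100], img_w=640, img_h=400)
    print(yolo_box)
    print(yolo_to_coco(yolo_box, img_w=640, img_h=400))
```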

2. Training

Our final models differ both in the input data used and the training steps applied.

YOLOv5s (version 1): trained on the raw dataset with added augmentations (around 40k images). We kept the 10 backbone layers frozen and fine-tuned the rest. Since the results were unsatisfactory, we manually cleaned the data by removing the poorly annotated images.
YOLOv5s (version 2): trained on the cleaned dataset with augmentations (about 20k images in total). Again, we trained all layers except the backbone.
YOLOv5s (version 3): starting from model version 2, fine-tuned on the cleaned dataset with extra augmentation steps, for a total of around 60k images. Only the last 6 layers were trained, keeping the other 18 frozen.
YOLOv5s (version 4): trained on the combined datasets from versions 2 and 3, tuning all layers except the backbone.
YOLOv5l: trained on the combined datasets from versions 2 and 3 with more augmentation steps. The training and validation sets together contained around 90k images. Again, we trained all layers except the backbone.
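
The freezing schemes above can be illustrated with a small sketch. It assumes that model parameters are named model.<index>.<rest>, where <index> is the position of the layer in the network, and that freezing the first N layers means excluding every parameter under those indices from training. The function name and the sample parameter names are hypothetical; in an actual training run the frozen parameters would have gradient updates disabled.

```python
def split_frozen(param_names, n_freeze):
    """Split parameter names into (frozen, trainable), freezing the
    first n_freeze layers (indices 0 .. n_freeze - 1)."""
    # The trailing dot prevents "model.1." from matching "model.10.".
    prefixes = tuple(f"model.{i}." for i in range(n_freeze))
    frozen = [n for n in param_names if n.startswith(prefixes)]
    trainable = [n for n in param_names if not n.startswith(prefixes)]
    return frozen, trainable


if __name__ == "__main__":
    # Hypothetical parameter names, for illustration only.
    names = [
        "model.0.conv.weight",
        "model.9.cv1.conv.weight",
        "model.10.conv.weight",
        "model.24.m.0.weight",
    ]
    # Backbone frozen (first 10 layers), as in versions 1, 2 and 4.
    print(split_frozen(names, 10))
    # First 18 layers frozen, as in version 3.
    print(split_frozen(names, 18))
```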

You can download the weights of all 5 models from here.

3. Evaluation

We used 2 different metrics to evaluate our models:

IoU
mAP

IoU, Intersection over Union, measures the quality of an object detector by the overlap between two bounding boxes (typically the predicted box and the ground-truth box): the area of their intersection divided by the area of their union. mAP, mean Average Precision, is the Average Precision (the area under the precision-recall curve for a particular class) averaged over all the classes.
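
The two metrics can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the evaluation code used for the results in this README: boxes are given as (x_min, y_min, x_max, y_max), and the area under the precision-recall curve uses the common "monotone envelope" interpolation, which is only one of several variants in use.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(recalls, precisions):
    """Area under the precision-recall curve for one class.
    recalls must be sorted in ascending order."""
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Make precision non-increasing from left to right (the envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values given as {class_name: ap}."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
    print(average_precision([0.5, 1.0], [1.0, 0.5]))
```

In the tables below, mAP 0.5 counts a prediction as correct when its IoU with the ground truth is at least 0.5, while mAP 0.5:0.95 averages the result over IoU thresholds from 0.5 to 0.95.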

Here are the average results for each model:

Model	mAP val 0.5	mAP val 0.5:0.95
YOLOv5s - v1	0.598	0.364
YOLOv5s - v2	0.851	0.662
YOLOv5s - v3	0.846	0.563
YOLOv5s - v4	0.881	0.664
YOLOv5l	0.943	0.713

(Figure panels: YOLOv5s - v1, YOLOv5s - v3, YOLOv5l)

Here are the results of YOLOv5l for each logo:

Logo	mAP val 0.5	mAP val 0.5:0.95	IoU >50% confidence	IoU >10% confidence
Adidas	0.98	0.753	0.873	0.897
Apple Inc.	0.981	0.761	0.896	0.902
Chanel	0.981	0.678	0.797	0.873
Coca Cola	0.886	0.619	0.786	0.853
Hard Rock Cafè	0.957	0.743	0.859	0.883
Mercedes Benz	0.984	0.789	0.915	0.924
NFL	0.965	0.731	0.876	0.892
Nike	0.959	0.706	0.868	0.889
Pepsi	0.942	0.676	0.843	0.860
Puma	0.925	0.667	0.793	0.865
Starbucks	0.975	0.823	0.916	0.924
The North Face	0.975	0.741	0.887	0.907
Toyota	0.961	0.737	0.883	0.903
Under Armour	0.977	0.71	0.867	0.880

3. Usage Tips

Weights:

Download the trained models' results and weights from here. Create a folder called Assets inside the project folder and put the downloaded folder inside it.

Logo_Detection|
              |--Assets|
                       |--Models|
                                |--yolov5l_extra_cleanData
                                ...

Data preparation:

Train and Inference:

Clone the repository on your local machine.
cd Logo_Detection
Create a folder named Assets
cd Assets
Create a folder named dataset
Put your YOLO-formatted data in it, following this structure:

--Assets|
        |--dataset|
                  |--train|
                  |        |--images
                  |        |--labels
                  |--valid
                  |        |--images
                  |        |--labels
                  |--test
                  |        |--images
                  |        |--labels
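
Before training, it can be useful to verify that the folder layout above is in place and that images have matching label files. The sketch below assumes that each image <name>.jpg (or .jpeg/.png) under <split>/images is paired with <split>/labels/<name>.txt. The function name and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: these are the image extensions present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unlabelled_images(dataset_dir):
    """Return images under <split>/images that have no matching
    <split>/labels/<stem>.txt file, for the train/valid/test splits."""
    missing = []
    for split in ("train", "valid", "test"):
        images_dir = Path(dataset_dir) / split / "images"
        labels_dir = Path(dataset_dir) / split / "labels"
        if not images_dir.is_dir():
            continue  # split not present, nothing to check
        for image_path in sorted(images_dir.iterdir()):
            if image_path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            if not (labels_dir / f"{image_path.stem}.txt").is_file():
                missing.append(image_path)
    return missing


if __name__ == "__main__":
    for path in find_unlabelled_images("Assets/dataset"):
        print("no label file for", path)
```

An image without a label file is not necessarily an error (it may simply contain no logo), so the output is best read as a list of files to double-check.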

Detection:

Clone the repository on your local machine.
cd Logo_Detection
Create a folder named Assets
cd Assets
Create a folder named testnow
Put all the images you want to run inference on in the testnow folder

Training:

Put your your_data.yaml file in yolov5/data (see the example logos_yolo5.yaml)
cd yolov5
Run the following command (you can change the model name or any other setting):

    $ python train.py --batch-size 32 --weights yolov5s.pt --data your_data.yaml --epochs 50 --hyp hyp.finetune.yaml --freeze 10

For more information on data preparation, refer to: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
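
As an illustration of what a your_data.yaml file typically contains, the sketch below generates one from Python. It assumes the common YOLOv5 layout with train, val, nc and names keys, and dataset paths relative to the yolov5 folder. The class order shown here is hypothetical: it must match the class ids used in your label files, so check it against logos_yolo5.yaml before use.

```python
import json

# Assumption: illustrative order only; ids in the label files define the real one.
CLASS_NAMES = [
    "Nike", "Adidas", "Under Armour", "Puma", "The North Face",
    "Starbucks", "Apple Inc.", "Mercedes-Benz", "NFL", "Coca-Cola",
    "Chanel", "Toyota", "Pepsi", "Hard Rock Cafè",
]


def build_data_yaml(dataset_dir, class_names):
    """Return the text of a YOLOv5-style dataset description file."""
    lines = [
        f"train: {dataset_dir}/train/images",
        f"val: {dataset_dir}/valid/images",
        f"test: {dataset_dir}/test/images",
        f"nc: {len(class_names)}",  # number of classes
        # A JSON list is also a valid YAML flow sequence.
        f"names: {json.dumps(class_names, ensure_ascii=False)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumption: the command is run from inside the yolov5 folder.
    print(build_data_yaml("../Assets/dataset", CLASS_NAMES))
```

The printed text can be saved as yolov5/data/your_data.yaml and passed to train.py through the --data option.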

Inference and detection:

Once the model has finished training, we can evaluate its performance on a test set (you can download it from here, or retrieve the image names and their predictions from test_results.csv). Follow the steps below to run the model and retrieve, for each input image, an image.jpg with a bounding box drawn around each prediction and the corresponding image.txt label file, which lists the detected classes and their bounding boxes in the format (class_id, x_cen, y_cen, width, height, confidence):

Open your terminal
Activate the environment in which you installed requirments.txt
Open detect_batch.sh with a text editor and set the variable Modelname to the model you are evaluating (the exact names are listed inside detect_batch.sh), then run the script.
See the results under Assets/outputs/<Modelname>. That folder contains all the output images, along with a subfolder called "labels" containing all the predicted labels.
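
The predicted label files can then be read back for further analysis. The sketch below assumes that each line holds the six whitespace-separated values (class_id, x_cen, y_cen, width, height, confidence) and that the box values are normalised to the range [0, 1]. Both function names are hypothetical.

```python
def parse_prediction_line(line):
    """Parse 'class_id x_cen y_cen width height confidence' into a dict."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 values, got {len(parts)}: {line!r}")
    x_cen, y_cen, width, height, confidence = map(float, parts[1:])
    return {
        "class_id": int(parts[0]),
        "x_cen": x_cen,
        "y_cen": y_cen,
        "width": width,
        "height": height,
        "confidence": confidence,
    }


def to_pixel_box(pred, img_w, img_h):
    """Convert a normalised centre-format box to pixel corners
    (x_min, y_min, x_max, y_max)."""
    x_min = (pred["x_cen"] - pred["width"] / 2) * img_w
    y_min = (pred["y_cen"] - pred["height"] / 2) * img_h
    x_max = (pred["x_cen"] + pred["width"] / 2) * img_w
    y_max = (pred["y_cen"] + pred["height"] / 2) * img_h
    return (x_min, y_min, x_max, y_max)


if __name__ == "__main__":
    pred = parse_prediction_line("3 0.5 0.5 0.25 0.5 0.9")
    print(pred)
    print(to_pixel_box(pred, img_w=640, img_h=480))
```

Filtering the parsed predictions by their confidence value is one way to reproduce thresholds such as the 50% and 10% levels reported in the per-logo table.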

4. The contributors

5. License

This project is licensed under the GNU General Public License v3.0 found in the LICENSE file in the root directory of this source tree.
