Rapid Classification AI Trainer
This is the repository for the Rapid Classification AI Trainer project (PSE SS 2021), developed for Fraunhofer IOSB.
In order to train deep-learning-based models, a lot of training data is required, and collecting and annotating this data can take a lot of time and resources. The goal of the Rapid Classification AI Trainer PSE project is a desktop application that uses search terms to load images from the Internet and then trains a deep learning model with them. Through the search terms, the images already carry a class assignment and no longer need to be annotated manually. A central requirement of the project is a high degree of modularity: both the loading of the images and the AI training are implemented via plug-ins, so that new data sources and new training methods can be added later.
The implementation will take place in C++ and Qt. Git is used for code management.
The program's tab structure guides you through the process; not every step is required. The main steps are:
- Create and/or select a project.
- Create and/or select a model.
- Download additional images with a plugin of your choice.
- Inspect the images.
- Start a training.
- Classify some images.
- Check training and classification results.
- Create and/or select a project.
- Delete an existing project.
- Add a pre-trained model.
- Add a user-defined model.
- Delete an existing model.
- Import additional images using keywords/labels.
- Load keywords/labels from a .txt file.
- Load some images for training and/or validation.
- Inspect the images.
- Configure and execute a data augmentation.
- Start a training.
- Check and save training results.
- Load some images for classification.
- Classify the images.
- Check and save classification results.
- See and analyse automatically generated:
- Confusion matrix
- Accuracy curve
- Most Misclassified Images
- Top 1 and Top 5 accuracy scores
- Add new plugin(s).
- Set and/or select a Classification plugin.
- Set and/or select an Imageloader plugin.
- Select the interface language (English, German or Russian).
- Check data augmentation preview.
- Compare configuration files of two models.
- Compare two (or more) results.
- Save the vector graphics images for all results.
- Support more crawler plugins for several popular search engines.
- Separate validation and training data set.
- Detect objects using MMDetection.
- Log to prevent overuse of the API key.
- Integrated storage management of data sets.
This program lets you download large numbers of images from Google, Bing and other search engines. By using it, you agree to respect the intellectual property and copyright of the image owners.
Search indexes merely index images and allow you to find them. They do NOT produce their own images and, as such, they don't own copyright on any of them. The original creators of the images own the copyrights.
Use this program only for educational purposes.
Uses the Flickr API through the `flickrapi` Python library. An API key and secret can be obtained from Flickr by following the steps outlined in their API guide. The Python package can be installed with `pip install flickrapi`.
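As a rough sketch of how such a search could be driven from Python: the helpers below (`photo_source_url`, `search_photo_urls`) are illustrative and not part of the project, and the URL pattern follows Flickr's documented static-image scheme.

```python
def photo_source_url(server, photo_id, secret, size="w"):
    """Build Flickr's documented static image URL:
    https://live.staticflickr.com/{server}/{id}_{secret}_{size}.jpg"""
    return f"https://live.staticflickr.com/{server}/{photo_id}_{secret}_{size}.jpg"


def search_photo_urls(keyword, api_key, api_secret, limit=10):
    """Search Flickr for a keyword and yield direct image URLs."""
    import flickrapi  # lazy import: URL helper stays usable without the package

    flickr = flickrapi.FlickrAPI(api_key, api_secret, format="etree")
    # walk() pages through search results and yields one XML element per photo
    for i, photo in enumerate(flickr.walk(text=keyword, per_page=min(limit, 500))):
        if i >= limit:
            break
        yield photo_source_url(photo.get("server"),
                               photo.get("id"),
                               photo.get("secret"))
```

The downloader itself then only needs to fetch each yielded URL.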
Download images from the Microsoft Bing search engine with the `bing-image-downloader` library, which can be installed with `pip install bing-image-downloader`. It is preferable to use the fork that downloads fewer duplicates; install it with `pip install .` in the directory containing `setup.py`.
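A call into `bing-image-downloader` could be wrapped as below; `download_class_images` and `expected_image_dir` are illustrative helpers of ours, and the keyword arguments mirror the library's documented parameters.

```python
from pathlib import Path


def expected_image_dir(output_dir, query):
    """bing-image-downloader stores images under <output_dir>/<query>/."""
    return Path(output_dir) / query


def download_class_images(query, limit=20, output_dir="dataset"):
    """Download `limit` images for one search term (class label) from Bing."""
    from bing_image_downloader import downloader  # lazy import: needs the package

    downloader.download(query,
                        limit=limit,
                        output_dir=output_dir,
                        adult_filter_off=False,
                        force_replace=False,
                        timeout=30)
    return expected_image_dir(output_dir, query)
```

Because the query doubles as the folder name, the downloaded images already carry their class assignment.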
Load images from a specified folder into the project. Folders are specified in the settings.
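The folder plugin's behaviour can be approximated by a small sketch that treats each subfolder name as a class label; this is a hypothetical helper, not the plugin's actual code.

```python
from pathlib import Path

IMG_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def load_labeled_images(root):
    """Map each class subfolder name to the sorted image files it contains."""
    root = Path(root)
    labeled = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir()
                        if f.suffix.lower() in IMG_EXTENSIONS)
        if images:  # skip folders without any usable images
            labeled[class_dir.name] = images
    return labeled
```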
Download images from the Google search engine with the Google-Images-Search Python library. Follow the instructions there to register a custom search engine. The plugin runs only under Linux because it depends on the curses library (see this).
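A sketch of driving Google-Images-Search, assuming its documented `GoogleImagesSearch`/`search`/`results` API; `build_search_params` and `download_images` are illustrative helpers, and the key/cx values are placeholders you obtain when registering the custom search engine.

```python
def build_search_params(keyword, num=10, file_type="jpg"):
    """Assemble the search_params dict expected by Google-Images-Search."""
    return {"q": keyword, "num": num, "fileType": file_type}


def download_images(keyword, api_key, cx, target_dir, num=10):
    """Search for `keyword` and download the results into target_dir."""
    from google_images_search import GoogleImagesSearch  # lazy import

    gis = GoogleImagesSearch(api_key, cx)
    gis.search(search_params=build_search_params(keyword, num=num))
    for image in gis.results():
        image.download(target_dir)
```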
MMClassification is a toolbox for image classification based on PyTorch and part of the open-source project OpenMMLab.
A detailed installation guide can be found in the documentation of MMClassification. The application was tested with versions 0.15.0 and 0.16.0. A detailed list of all requirements can be found under Documentation/requirements.txt.
- Data augmentation with albumentations: for data augmentation, albumentations is needed, which can be installed with `pip install -U albumentations`.
- webp support: for webp images, webp support must be installed with `pip install webp`, and in the ImageNet dataset class the file extension `.webp` must be added to the allowed image extensions in `IMG_EXTENSIONS`.
This plugin uses the existing structure of the MMClassification repository, which is described in its tutorial. The directory structure can be found at Documentation/MMClassificationDirectoryStructure. It contains the two folders configs and checkpoints and can be integrated into the repository or kept separate from it. New configs are stored in the configs folder. Only the default checkpoint files are stored in the checkpoints folder; user-generated checkpoint files can be found in the working directory of the corresponding project. Due to the size of the checkpoint files, the following files have to be downloaded into the checkpoints folder. Together with the corresponding config file we call them base models, because these models, pretrained on ImageNet, are the starting point for the actual training process. The supported base models are: ResNet-50, ResNet-101, ResNeXt-32x8d-101, SE-ResNet-50 and MobileNetV3 Large. The path to this structure must be specified in the settings of the plugin under MMClassification path.
- Creation of the confusion matrix

  A confusion matrix is generated because of the support metric specified in the config file, but to save its data for further use the following code must be added to `eval_metrics.py` after the line `confusion_matrix = calculate_confusion_matrix(pred, target)` (make sure `json` is imported there):

  ```python
  torch.set_printoptions(profile="full")
  matrix = confusion_matrix.data.tolist()
  with open("PLACEHOLDER PATH TO MMClassification Directory/data_confusion_matrix.json", 'w') as outfile:
      json.dump(matrix, outfile)
  torch.set_printoptions(profile="default")
  ```

  The placeholder must be replaced with the path to mmclassification. It must be the same path as the one specified in the MMClassification plugin settings under MMClassification path, and the file name must be `data_confusion_matrix.json`.

- Creation of the annotation files

  Annotation files are needed when classifying images with the test script; otherwise the order of the classification results would be unclear and the input images could not be matched with their corresponding results. This dataset class ensures that the input files are read lexicographically and that an annotation file is generated in the input directory. First, the following file must be added to the dataset directory of mmcls and its class name added to the `__init__.py` file:

  ```python
  import mmcv
  import numpy as np
  from .builder import DATASETS
  from .base_dataset import BaseDataset
  from .imagenet import find_folders, get_samples
  from tkinter import Tcl


  @DATASETS.register_module()
  class LexicographicallySorted(BaseDataset):
      IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm',
                        '.tif', '.webp')

      def load_annotations(self):
          if self.ann_file is None:
              folder_to_idx = find_folders(self.data_prefix)
              samples = get_samples(
                  self.data_prefix,
                  folder_to_idx,
                  extensions=self.IMG_EXTENSIONS)
              if len(samples) == 0:
                  raise (RuntimeError('Found 0 files in subfolders of: '
                                      f'{self.data_prefix}. '
                                      'Supported extensions are: '
                                      f'{",".join(self.IMG_EXTENSIONS)}'))
              self.folder_to_idx = folder_to_idx
          elif isinstance(self.ann_file, str):
              with open(self.ann_file) as f:
                  samples = [x.strip().rsplit(' ', 1) for x in f.readlines()]
          else:
              raise TypeError('ann_file must be a str or None')

          # sort samples lexicographically
          self.samples = Tcl().call('lsort', '-dict', samples)

          data_infos = []
          path = self.data_prefix + '/val.txt'
          with open(path, 'w') as f:
              for filename, gt_label in self.samples:
                  info = {'img_prefix': self.data_prefix}
                  info['img_info'] = {'filename': filename}
                  info['gt_label'] = np.array(gt_label, dtype=np.int64)
                  data_infos.append(info)
                  annotationFileLine = (self.data_prefix + "/" + filename
                                        + " " + str(gt_label) + "\n")
                  f.write(annotationFileLine)
          return data_infos
  ```

  The dataset can then be used in the dataset config by navigating to `mmclassification/configs/datasets/default_dataset.py` and, inside the `test = dict(` block, changing the line `type = dataset_type` to `type = 'LexicographicallySorted'`.

- Fix log error to allow the creation of the accuracy curve

  In the file `mmcls/apis/train.py` the priority of `IterTimerHook` must be changed from 'NORMAL' to 'LOW'. Otherwise the validation step is documented incorrectly in the `log.json` file and the accuracy curve cannot be generated.
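Once `data_confusion_matrix.json` exists, it can be post-processed outside MMClassification. The sketch below assumes rows of the matrix correspond to ground-truth classes; the path is a placeholder for the MMClassification path from the plugin settings.

```python
import json


def per_class_accuracy(matrix):
    """Given a confusion matrix (rows = ground-truth classes),
    return the fraction of correctly classified samples per class."""
    accuracies = []
    for i, row in enumerate(matrix):
        total = sum(row)
        accuracies.append(row[i] / total if total else 0.0)
    return accuracies


def load_confusion_matrix(path="data_confusion_matrix.json"):
    """Read the matrix saved by the eval_metrics.py modification above."""
    with open(path) as f:
        return json.load(f)
```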
Connect to the remote server using a terminal like MobaXTerm.
The program can then be started on the remote machine with the command `./RCAIT -platform vnc:size=1920x1080`.
The Qt qVNC plugin then starts a VNC server on port 5900 by default. The supplied resolution is optional, but ensures that the window will be shown completely. Thereafter, you can view and interact with the GUI of the application from your machine with a VNC viewer of your choosing. We recommend TightVNC or RealVNC. Depending on your setup, port forwarding is also required.
You can also play around with scaling options if the window is too small. If the bound port is already in use, you can select a custom port as well: `QT_AUTO_SCREEN_SCALE_FACTOR=0 QT_SCALE_FACTOR=1.5 ./RCAIT -platform "vnc:size=1920x1080:port=1234"`
This project has been equipped with full language support from the beginning. This has several advantages: hardcoded texts and variables are centralized and avoided, the program is accessible to more people, and it can be extended with further languages with little effort.
To find new texts in the program, the option `set(CREATE_NEW_TRANSLATIONS OFF)` in the `CMakeLists.txt` file must be set to `ON`. At the next compilation the message "Creating new translations..." will appear and the `.ts` files in the `languages` folder are updated. These can now be

- filled manually with the Qt Linguist (boring)
- filled automatically (way faster)
We have developed this method ourselves.
- Enter the `languages` directory
- Drag the `.ts` file onto `ts2txt.py`
- Translate the resulting `.txt` file with a translator of your choice, e.g. DeepL
- Paste the translations over the `.txt` file's content
- Drag the `.txt` file onto `txt2ts.py`
- Done!
Info: If texts created during development are deleted, their translations are not deleted but only marked as obsolete. This can make the language file very large.
Solution: Drag the `.ts` file onto `cleanup_obsolete.py` and the problem is solved!
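`ts2txt.py` itself is part of the repository and not reproduced here; as a hypothetical sketch, the extraction half of such a script could walk Qt's XML-based `.ts` format and emit one source string per line.

```python
import xml.etree.ElementTree as ET


def extract_sources(ts_path):
    """Collect all <source> strings from a Qt .ts translation file,
    one per <message> element, in document order."""
    tree = ET.parse(ts_path)
    return [msg.findtext("source", default="")
            for msg in tree.iter("message")]
```

The inverse step (`txt2ts.py`) would write the translated lines back into the matching `<translation>` elements.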
The program can be built with CMake and was developed and tested with Qt 6.1.2.
It can (very rarely) happen that the QSettings registry object is corrupted while writing to or reading from the system. If that happens, the program will no longer start and the debugger will only show a SIGSEGV at 0x0000000. This is not a fault of our code and there is nothing we can do to fix it, because it happens before our program is even loaded.
- Under Linux, delete the file `/home/ies/<username>/.config/PSE2021/RCAIT.conf`.
- Under Windows, delete the registry entry `Computer\HKEY_CURRENT_USER\Software\PSE_SS21\RCAIT`.
- Then restart the application.
RCAIT is released under the LGPLv3 license (GNU Lesser General Public License, Version 3).