This repository lets you start training a state-of-the-art deep learning model with little to no configuration. Provide your labeled dataset, start the training right away, and monitor it with TensorBoard. You can also test your model with the built-in inference REST API. Training with TensorFlow has never been so easy.
- This repository is based on the TensorFlow Object Detection API.
- The TensorFlow version used in this repo is 1.13.1.
- We plan on supporting TF 2 as soon as the Tensorflow Object Detection API officially supports it.
- The built-in inference REST API works on CPU and doesn't require any GPU usage.
- All of the supported networks in this project are taken from the TensorFlow model zoo.
- The pre-trained weights that you can use out of the box are based on the COCO dataset.
- Ubuntu 18.04
- NVIDIA Drivers (410.x or higher)
- Docker CE latest stable release
- NVIDIA Docker 2
- Docker-Compose
To check if you have docker-ce installed:
docker --version
To check if you have docker-compose installed:
docker-compose --version
To check if you have nvidia-docker installed:
nvidia-docker --version
To check your NVIDIA driver version:
nvidia-smi
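The checks above can also be scripted. The following is a minimal sketch, assuming only that each tool is exposed as an executable on your PATH; the helper name `missing_tools` is hypothetical and not part of this repository. It reports which executables are absent but does not verify their versions.

```python
import shutil

# Executables named in the checks above.
REQUIRED_TOOLS = ["docker", "docker-compose", "nvidia-docker", "nvidia-smi"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    (Hypothetical helper, not part of the repository.)
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

A tool that is found on PATH may still be too old, so the version commands above remain the authoritative check.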
- If you have neither Docker nor Docker-Compose installed, use the following command:
  chmod +x install_full.sh && source install_full.sh
- If you have Docker CE installed and only want to install Docker-Compose and perform the necessary operations, use the following command:
  chmod +x install_compose.sh && source install_compose.sh
- If both Docker CE and Docker-Compose are installed, use the following command:
  chmod +x install_minimal.sh && source install_minimal.sh
- Install NVIDIA Drivers (410.x or higher) and NVIDIA Docker for GPU training by following the official docs.
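The choice between the three install scripts can be written as a small decision function. This is a sketch under stated assumptions: `choose_install_script` is a hypothetical helper, and the case where Docker-Compose is present without Docker is not covered by the instructions above, so it falls back to the full install here.

```python
import shutil


def choose_install_script(has_docker: bool, has_compose: bool) -> str:
    """Map the installed tools to the install script named in the list above.

    (Hypothetical helper, not part of the repository.)
    """
    if has_docker and has_compose:
        return "install_minimal.sh"
    if has_docker:
        return "install_compose.sh"
    # Neither tool is installed. Assumption: Docker-Compose without Docker
    # is treated the same way, since the instructions do not cover it.
    return "install_full.sh"


if __name__ == "__main__":
    script = choose_install_script(
        has_docker=shutil.which("docker") is not None,
        has_compose=shutil.which("docker-compose") is not None,
    )
    print(f"chmod +x {script} && source {script}")
```

The script only prints the command rather than running it, because the install scripts are meant to be sourced in your own shell.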
- Make sure that the deleteme files in the datasets and checkpoints folders are deleted. (deleteme files are placeholder files used for git.)
- Make sure that the base_dir field in docker_sdk_api/api/paths.json is correct (it must match the path of the root of the repo on your machine).
- Go to gui/src/environments/environment.ts and gui/src/environments/environment.prod.ts and change the following:
  - field `url`: must match the IP address of your machine
  - the IP field of the `inferenceAPIUrl`: must match the IP address of your machine

  (**Use the `ifconfig` command to check your IP address. Please use your private IP, which starts with either 10. or 172.16. or 192.168.**)
- If you are behind a proxy, change the args http_proxy and https_proxy in build.yml to match the address of your proxy. (You can find build.yml in the repo's root directory.)
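Two of the changes above lend themselves to a quick sanity check before building. The sketch below is illustrative only: the helper names are hypothetical, it assumes `base_dir` is a top-level key in paths.json, and it reads the "172.16." prefix as the full RFC 1918 block 172.16.0.0/12.

```python
import ipaddress
import json
import os

# RFC 1918 private IPv4 ranges. Assumption: "172.16." in the instructions
# refers to the whole 172.16.0.0/12 block.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_ipv4(address: str) -> bool:
    """Return True if `address` is an IPv4 address in one of the ranges above."""
    try:
        ip = ipaddress.IPv4Address(address)
    except ValueError:
        return False
    return any(ip in network for network in PRIVATE_NETWORKS)


def base_dir_matches(paths_json: str, repo_root: str) -> bool:
    """Return True if the base_dir field of `paths_json` points at `repo_root`.

    Assumption: base_dir is a top-level string key in the JSON file.
    """
    with open(paths_json, encoding="utf-8") as handle:
        config = json.load(handle)
    base_dir = config.get("base_dir")
    if not isinstance(base_dir, str):
        return False
    return os.path.realpath(base_dir) == os.path.realpath(repo_root)
```

For example, `is_private_ipv4("192.168.1.10")` returns True, while a loopback address such as 127.0.0.1 is rejected because it is not reachable from other machines.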
The following is an example of how a dataset should be structured. Please put all your datasets in the datasets folder.
├──datasets/
    ├──dummy_dataset/
        ├── images
        │   ├── img_1.jpg
        │   └── img_2.jpg
        ├── labels
        │   ├── json
        │   │   ├── img_1.json
        │   │   └── img_2.json
        │   └── pascal
        │       ├── img_1.xml
        │       └── img_2.xml
        └── objectclasses.json
PS: You don't need both the json and pascal folders; either one is enough.
- If you want to label your images, you can use LabelImg, a free, open-source image annotation tool. It supports the Pascal VOC XML label format.
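The dataset layout described above can be checked before starting a job. The following is a rough sketch, not the validation the application itself performs: `check_dataset` is a hypothetical helper, the accepted image extensions are an assumption (the example only shows .jpg), and it assumes every image needs a label file with the same base name, as the example names suggest.

```python
import os

# Assumption: the example tree only shows .jpg files.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")
# Label folder name -> label file extension, as in the example tree.
LABEL_FORMATS = {"json": ".json", "pascal": ".xml"}


def check_dataset(dataset_dir):
    """Return a list of problems found in `dataset_dir` (empty if none)."""
    problems = []
    images_dir = os.path.join(dataset_dir, "images")
    labels_dir = os.path.join(dataset_dir, "labels")

    if not os.path.isfile(os.path.join(dataset_dir, "objectclasses.json")):
        problems.append("objectclasses.json is missing")
    if not os.path.isdir(images_dir):
        problems.append("images folder is missing")
        return problems

    # Base names of all images, e.g. "img_1" for "img_1.jpg".
    stems = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(images_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )

    # Either labels/json or labels/pascal is enough.
    present = [
        fmt for fmt in LABEL_FORMATS
        if os.path.isdir(os.path.join(labels_dir, fmt))
    ]
    if not present:
        problems.append("labels must contain a json or pascal folder")
        return problems

    # Every image should have a same-named label in each folder present.
    for fmt in present:
        ext = LABEL_FORMATS[fmt]
        for stem in stems:
            if not os.path.isfile(os.path.join(labels_dir, fmt, stem + ext)):
                problems.append(f"labels/{fmt}/{stem}{ext} is missing")
    return problems
```

Calling `check_dataset("datasets/dummy_dataset")` on a complete dataset returns an empty list; any missing file shows up as one entry in the result.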
You must include in your dataset an objectclasses.json file with a similar structure to the example below:
To build the solution, run the following command from the repository's root directory:
docker-compose -f build.yml build
To run the solution, run the following command from the repository's root directory:
docker-compose -f run.yml up
After a successful run you should see something like the following:
- If the app is deployed on your machine, open your web browser and go to localhost:4200 or 127.0.0.1:4200.
- If the app is deployed on a different machine, open your web browser and go to <machine_ip>:4200.
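If you start the solution from a script, it can help to wait until port 4200 accepts connections before opening the browser. The helper below is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical name, and the timeout values are arbitrary defaults rather than anything the application requires.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False once `timeout`
    seconds have passed without a successful connection.
    (Hypothetical helper, not part of the repository.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For a local deployment you would call `wait_for_port("localhost", 4200)`; for a remote one, pass the machine's IP address instead. A successful connection only shows that something is listening on the port, not that the GUI has finished loading.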
Prepare your dataset for training
Specify the general parameters for your docker container
Specify the hyperparameters for the training job
Check your training logs to get better insights on the progress of the training
Monitor the training using Tensorboard
Check the status to know when the job is completed successfully
Download your model and easily test it with the built-in inference API using Swagger
Delete the container's job to stop an ongoing job or to remove the container of a finished job. (Finished jobs are always available to download)
Check our tips document for (1) better insight into training models, based on our expertise, and (2) a benchmark of the inference speed.
Our TensorBoard document helps you find your way around TensorBoard more easily.
- In advanced configuration mode, be careful when making changes because they can cause errors during training. If that happens, stop the job and try again.
- In general settings, choose the container name carefully because a name already used by another container will cause errors.
- When you leave TensorBoard open for a long time, it might freeze. If this happens, closing the TensorBoard tab in the browser and reopening it solves the problem.
Joe Sleiman, inmind.ai, Beirut, Lebanon
Daniel Anani, inmind.ai, Beirut, Lebanon
Joe Abou Nakoul, inmind.ai, Beirut, Lebanon
Michael Ghosn, inmind.ai, Beirut, Lebanon
Elie Haddad, Beirut, Lebanon