Step 0: Create A Secrets File
Create an empty text file, name it secrets.txt, then copy and paste the template below and save it.
LS_HOST=
TOKEN=
S3_BUCKET=
S3_REGION=us-east-1
S3_ACCESS_KEY=
S3_SECRET_KEY=
S3_ENDPOINT=s3.wasabisys.com
DB_CONNECTION_STRING=
DB_NAME=label_studio
This file is very important, and you will be using it a lot, so keep it somewhere safe.
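Because every later step fills in one or more of these values, it can help to check the file before moving on. The following is a minimal sketch, not part of the project: the helper names parse_secrets, missing_keys, and load_secrets are hypothetical, and the list of required names is simply taken from the template above.

```python
from pathlib import Path

# Variable names taken from the secrets.txt template.
REQUIRED_KEYS = [
    "LS_HOST", "TOKEN", "S3_BUCKET", "S3_REGION", "S3_ACCESS_KEY",
    "S3_SECRET_KEY", "S3_ENDPOINT", "DB_CONNECTION_STRING", "DB_NAME",
]


def parse_secrets(text):
    """Parse NAME=value lines into a dict, skipping blanks and '...' filler."""
    secrets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "..." or "=" not in line:
            continue
        # Split on the first '=' only: connection strings contain '=' too.
        name, value = line.split("=", 1)
        secrets[name.strip()] = value.strip()
    return secrets


def missing_keys(secrets):
    """Return the required names that are absent or still empty."""
    return [key for key in REQUIRED_KEYS if not secrets.get(key)]


def load_secrets(path="secrets.txt"):
    """Read and parse the secrets file from disk."""
    return parse_secrets(Path(path).read_text())
```

Calling missing_keys(load_secrets()) after each step lists the values that still need to be filled in.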
Step 1: Deploy Label-Studio
- Sign up on Heroku: https://signup.heroku.com/ and verify your email.
- Click the Heroku deploy button.
- Pick any name for the app (e.g., label-studio-0).
- Change DISABLE_SIGNUP_WITHOUT_LINK from 0 to 1. For USERNAME, type your email address.
- Enter a username and password for the default account that you will use for label-studio.
- Click deploy!
When your label-studio app is deployed, you will see a confirmation with a View button.
- Click on View, then log in with your username and password. Copy the base URL of your home page and use it for LS_HOST in your secrets.txt file. For example: https://my-labelstudio.herokuapp.com (make sure you include https://, and don't include anything after .com).
LS_HOST=https://replace-me-with-your-app-url.herokuapp.com
...
- Click on your initials icon (top right) -> Account & Settings, then copy the value of Access Token. Add the token value to the TOKEN variable in your secrets.txt file. For example:
...
TOKEN=SoMe-sUpEr-sEcReT-LaBeL-StUdIo-tOkEn
...
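The two values from this step can be sanity-checked with a short sketch. It trims LS_HOST down to the https:// scheme and host, as the step asks, and builds (without sending) a request that lists your projects. The helper names are hypothetical, and the /api/projects path and the "Authorization: Token ..." header are assumptions about the label-studio REST API rather than something this guide specifies.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def normalize_ls_host(url):
    """Keep only the scheme and host, e.g. https://my-app.herokuapp.com."""
    parts = urlsplit(url.strip())
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("LS_HOST must start with https://")
    return f"https://{parts.netloc}"


def projects_request(ls_host, token):
    """Build (but do not send) an authenticated request to list projects.

    Assumption: the API accepts the access token in a 'Token' header.
    """
    return Request(
        f"{normalize_ls_host(ls_host)}/api/projects",
        headers={"Authorization": f"Token {token}"},
    )
```

Passing the result to urllib.request.urlopen sends the request; a successful response suggests that both LS_HOST and TOKEN were copied correctly.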
Step 2: Create a cloud storage
You will want to keep your data in the cloud so it can be moved easily between different apps. For that, you will create a Simple Storage Service (S3) bucket, where all your data will be kept. There are many S3-compatible object storage services, but here we will use Wasabi.
- Sign up on Wasabi: https://wasabi.com/sign-up/ and verify your email.
- Open the console: https://console.wasabisys.com, then click on Buckets -> CREATE BUCKET.
- Pick a name for your bucket (e.g., data-0), select a region (e.g., us-east-1), then click Next. Copy the bucket name and region name to your secrets.txt file. For example:
...
S3_BUCKET=My-bUcKeT-NaMe
S3_REGION=us-east-1
...
- Keep the options as they are, then click Next, then click CREATE BUCKET.
That's it! You made an S3 bucket! Now generate an access key ID and secret key to connect to the bucket.
- Click on the key icon on the left bar -> click on CREATE NEW ACCESS KEY. Leave the default selection as it is, then click create.
- Copy the keys to the clipboard, then use the value of the access key ID for S3_ACCESS_KEY and the secret key for S3_SECRET_KEY in your secrets.txt file (don't paste the variable names that were automatically copied, just their values). For example:
...
S3_ACCESS_KEY=mY-s3-BuCkEt-aCcEsS-KeY
S3_SECRET_KEY=mY-sUpEr-sEcReT-S3-bUcKeT-SeCrEt-kEy
...
- Visit Wasabi's service URLs page and copy the Service URL that corresponds to your bucket's region. Then use its value for S3_ENDPOINT (e.g., the endpoint for us-east-1 is https://s3.wasabisys.com). Remove https:// from the URL. For example:
...
S3_ENDPOINT=s3.wasabisys.com
...
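To confirm that the bucket and keys work before continuing, a check along these lines may help. This is a sketch under assumptions: strip_scheme and bucket_is_reachable are hypothetical helpers, secrets is a dict holding the values from secrets.txt, and the connection test relies on the third-party boto3 package, which this guide does not otherwise require.

```python
def strip_scheme(url):
    """Turn https://s3.wasabisys.com into s3.wasabisys.com for S3_ENDPOINT."""
    url = url.strip().rstrip("/")
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url


def bucket_is_reachable(secrets):
    """Return True if the bucket answers a HEAD request with these keys."""
    # Imported here so strip_scheme works without boto3 installed.
    import boto3  # third-party: pip install boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client(
        "s3",
        # boto3 expects a full URL, so the scheme is added back.
        endpoint_url="https://" + strip_scheme(secrets["S3_ENDPOINT"]),
        region_name=secrets["S3_REGION"],
        aws_access_key_id=secrets["S3_ACCESS_KEY"],
        aws_secret_access_key=secrets["S3_SECRET_KEY"],
    )
    try:
        client.head_bucket(Bucket=secrets["S3_BUCKET"])
    except (BotoCoreError, ClientError):
        return False
    return True
```

If bucket_is_reachable returns False, the usual suspects are a mistyped key, a bucket name with different capitalization, or an endpoint that does not match the bucket's region.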
Step 3: Create a backend database
This database will be used to store all the information related to the data generated by the project. We will use MongoDB as a backend database. You can create one for free with plenty of storage for what we need.
- Sign up on MongoDB Atlas: https://www.mongodb.com/cloud/atlas/register and verify your email.
- Visit https://cloud.mongodb.com and click on Build a Database -> select Shared -> Create.
- Keep the options as they are. You can change the cluster name if you want, or just keep the default name.
- Click Create Cluster, then pick a username and a strong password, then click Create user.
- For the IP Address, enter 0.0.0.0/0 -> Add Entry. Then, click Finish and close.
- Click Go to database -> click Connect. Select Python for the driver and 3.6 or later for the version.
- Copy the connection string, replace <password> with your database password, and paste it as the value for DB_CONNECTION_STRING in your secrets.txt file. Leave the database name as it is. For example:
...
DB_CONNECTION_STRING=mongodb+srv://server.mongodb.net/myFirstDatabase?retryWrites=true&w=majority
DB_NAME=label_studio
...
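Passwords that contain characters such as @, / or : need to be percent-encoded before they go into a MongoDB connection string. The sketch below does the <password> substitution with that encoding and then pings the database. The helper names are hypothetical, and the ping relies on the third-party pymongo package with its srv extra, which is an assumption about your local setup.

```python
from urllib.parse import quote_plus


def fill_password(connection_string, password):
    """Replace the <password> placeholder with a percent-encoded password."""
    if "<password>" not in connection_string:
        raise ValueError("no <password> placeholder found")
    return connection_string.replace("<password>", quote_plus(password))


def database_is_reachable(connection_string):
    """Return True if the server answers a ping within five seconds."""
    # Imported here so fill_password works without pymongo installed.
    import pymongo  # third-party: pip install "pymongo[srv]"
    from pymongo.errors import PyMongoError

    try:
        client = pymongo.MongoClient(
            connection_string, serverSelectionTimeoutMS=5000
        )
        client.admin.command("ping")
    except PyMongoError:
        return False
    return True
```

For example, fill_password("mongodb+srv://user:<password>@server.mongodb.net/db", "p@ss/w:rd") yields a string in which the password appears as p%40ss%2Fw%3Ard.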
Step 4: Connecting the cloud storage to label-studio
- Go back to your label-studio application. Click on Create project.
- Pick a name for your project, then click on labeling setup and select object detection with bounding boxes.
- Remove the two default labels, then add the labels that you expect to see in your dataset (you can edit this later to add more). Make sure to add one label per line (note: a label should not include a backslash, \). Click on Add, then save.
- Go to the project settings (top right) -> click on cloud storage -> add source storage.
- For the Bucket Name, Region Name, Access Key ID, and Secret Key fields, use the values of S3_BUCKET, S3_REGION, S3_ACCESS_KEY, and S3_SECRET_KEY from your secrets.txt file, respectively.
- For S3 endpoint, use the S3_ENDPOINT value from your secrets.txt file, but prepend https:// to the URL.
- Clear the Session token field and leave it empty. Toggle Treat every bucket object as a source file and Recursive scan to turn them ON, set the expiry time to 120, then click Add storage.
Now anything you upload to the bucket can be synced to label-studio! Upload your files, then sync the storage.
Now if you upload any image to your bucket and sync the storage, you will be able to see the images you uploaded as tasks in your label-studio project. You can label the objects in an image by opening a task -> clicking the label, then drawing a bounding box around the object -> submit.
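The source storage form in this step maps onto the secrets roughly as follows. This is only an illustration of which value goes where: the function name is hypothetical, and the dictionary keys are assumptions modeled on label-studio's S3 storage settings, so check them against your label-studio version before using them with its API.

```python
def source_storage_settings(secrets, project_id):
    """Map secrets.txt values onto the source storage form fields.

    The key names below are assumptions, not confirmed by this guide.
    """
    endpoint = secrets["S3_ENDPOINT"]
    # The form expects a full URL, so https:// is added back if missing.
    if not endpoint.startswith(("https://", "http://")):
        endpoint = "https://" + endpoint
    return {
        "project": project_id,
        "bucket": secrets["S3_BUCKET"],
        "region_name": secrets["S3_REGION"],
        "s3_endpoint": endpoint,
        "aws_access_key_id": secrets["S3_ACCESS_KEY"],
        "aws_secret_access_key": secrets["S3_SECRET_KEY"],
        "aws_session_token": "",  # Session token stays empty
        "use_blob_urls": True,    # Treat every bucket object as a source file
        "recursive_scan": True,   # Recursive scan
        "presign_ttl": 120,       # expiry time
    }
```

Keeping the endpoint without a scheme in secrets.txt and adding https:// only here mirrors the two instructions in the steps above.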
Step 5: Setting up the model prediction workflow
- If you don't have an account yet, sign up on GitHub and verify your email.
- First, fork the model repository by clicking the Fork button, then click Create fork.
- Open your fork page (the page URL will contain YOUR_USERNAME/BirdFSD-YOLOv5), then click on your forked repository's settings -> Secrets.
- For every variable in your secrets.txt file, copy and paste the name of the variable (before =) into the Name field and use the value that corresponds to that name (after =) for the secret's Value field. Repeat this for every secret in your secrets.txt file.
- Enable workflows in your fork.
- Then, click on and enable all the workflows that are highlighted with a red square in the image below:
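Adding the secrets one by one in the browser works fine, but if you have the GitHub CLI (gh) installed and authenticated, the same step can be scripted. The sketch below only builds the commands; gh_secret_commands is a hypothetical helper, secrets is a dict of the values from secrets.txt, and using gh at all is an assumption, since this guide describes the web interface.

```python
def gh_secret_commands(secrets, repo):
    """Build one 'gh secret set' command per non-empty secret.

    repo is your fork, e.g. YOUR_USERNAME/BirdFSD-YOLOv5.
    """
    return [
        ["gh", "secret", "set", name, "--repo", repo, "--body", value]
        for name, value in secrets.items()
        if value  # skip variables that are still empty
    ]
```

Each command can be run with subprocess.run(command, check=True). Passing the arguments as a list avoids shell quoting problems with characters such as & in the connection string.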
Step 6: Training the model
- You can train your model when you have "enough" annotations. The number of annotations required for a reliable model will differ based on your use case and the size of your dataset. Train your model every now and then after you annotate a sizeable chunk of your data (e.g., after every 100, 500, or 1000 new annotations). The more annotated data you add, the better the model will learn.
- Sign up to W&B to track your training (optional, but recommended).
- Log in to your Google account, then click the button to open the training notebook in Google Colab.
- Click on Copy to Drive.
- Click on the folder icon, then drag and drop secrets.txt to the files section in Google Colab.
- Right-click the file -> Rename file, then rename it to .env (don't worry if you can't see the file after renaming it; it just became a hidden file).
- Follow the instructions in the notebook to train the model.
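If you want to confirm that the renamed file is still there, listing the directory from a notebook cell will show it, because Python's os.listdir includes hidden files. The sketch below demonstrates the rename in a throwaway directory so that no real file is touched; rename_to_dotenv is a hypothetical helper, not something the notebook provides.

```python
import os
import tempfile


def rename_to_dotenv(directory):
    """Rename secrets.txt to .env inside directory and return the new path."""
    source = os.path.join(directory, "secrets.txt")
    target = os.path.join(directory, ".env")
    os.rename(source, target)
    return target


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "secrets.txt"), "w") as handle:
        handle.write("DB_NAME=label_studio\n")
    rename_to_dotenv(workdir)
    # os.listdir includes dot-files, unlike a default file browser view.
    files_after_rename = os.listdir(workdir)
```

In Colab itself, running os.listdir(".") in a cell after the rename should list .env alongside the other files in the working directory.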