Python demo for ID card digitization using Nanonets

NanoNets ID Card Digitization Sample


Extracting Information From ID Cards

Each image has an annotation file with the same base name, containing the bounding boxes for that image. The repository includes a Python example for training a model; to use it, update the API key and model ID in the corresponding file. There is also a folder of pre-processed JSON annotations that are ready to send as payloads to the NanoNets API.
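Because each annotation file shares its base name with its image, pairing the two comes down to matching file stems. The following is a minimal sketch of that idea; the function name `pair_by_stem` and the example file names are illustrative and are not part of the repository's code.

```python
from pathlib import Path


def pair_by_stem(image_names, annotation_names):
    """Match image files to annotation files that share a base name.

    Returns a dict mapping each image name to its annotation name.
    Images without a matching annotation are left out.
    """
    # Index annotations by stem, e.g. "111.json" -> "111".
    by_stem = {Path(name).stem: name for name in annotation_names}
    return {
        image: by_stem[Path(image).stem]
        for image in image_names
        if Path(image).stem in by_stem
    }


# Hypothetical file names, used only to show the call.
print(pair_by_stem(["111.jpg", "112.jpg"], ["111.json"]))
```

An image that has no annotation is skipped rather than raising an error, which makes it easy to spot gaps by comparing the result against the full image list.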


Build an ID Card Info Extraction Model

Note: Make sure Python and pip are installed on your system. If they are not, see the Python and pip websites for installation instructions.

Step 1: Clone the Repo, Install dependencies

git clone https://github.com/NanoNets/nanonets-id-card-digitization.git
cd nanonets-id-card-digitization
sudo pip install nanonets

Step 2: Get your free API Key

Get your free API Key from http://app.nanonets.com/#/keys

Step 3: Set the API key as an Environment Variable

export NANONETS_API_KEY=YOUR_API_KEY_GOES_HERE

Step 4: Upload Images For Training

The training data is in the images folder (image files) and the annotations folder (annotations for the image files).

python ./code/training.py

Note: This generates a MODEL_ID that you need for the next step.

Step 5: Add Model Id as Environment Variable

export NANONETS_MODEL_ID=YOUR_MODEL_ID

Note: You will get YOUR_MODEL_ID from the previous step.
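A script that depends on these two environment variables can check for them up front and fail with a clear message if either is missing. This is a minimal sketch; the helper name `load_settings` is hypothetical, and the repository's own scripts may read the variables differently.

```python
import os


def load_settings(environ=os.environ):
    """Read the API key and model ID exported in Steps 3 and 5.

    Raises RuntimeError naming the first variable that is missing or empty.
    """
    settings = {}
    for name in ("NANONETS_API_KEY", "NANONETS_MODEL_ID"):
        value = environ.get(name, "").strip()
        if not value:
            raise RuntimeError(f"{name} is not set; run: export {name}=...")
        settings[name] = value
    return settings
```

Passing the environment as a parameter keeps the helper easy to test with a plain dict instead of the real process environment.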

Step 6: Get Model State

The model takes about 2 hours to train. You will get an email once the model is trained. In the meantime, you can check the state of the model:

python ./code/model-state.py
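If you would rather wait in a script than re-run the state check by hand, a small polling loop is one option. The sketch below is generic: `check` is a caller-supplied function (hypothetical here) that returns True once training has finished, since the output format of model-state.py is not described in this README. The default interval and attempt limit are arbitrary choices.

```python
import time


def wait_until(check, interval_seconds=300, max_checks=48, sleep=time.sleep):
    """Call check() repeatedly until it returns True or attempts run out.

    Returns the number of checks made on success, or None if the
    limit was reached before check() returned True.
    """
    for attempt in range(1, max_checks + 1):
        if check():
            return attempt
        # Do not sleep after the final attempt.
        if attempt < max_checks:
            sleep(interval_seconds)
    return None
```

With the defaults, the loop checks every 5 minutes for up to about 4 hours, which leaves headroom over the expected training time.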

Step 7: Make Prediction

Once the model is trained, you can use it to make predictions:

python ./code/prediction.py PATH_TO_YOUR_IMAGE.jpg

Sample Usage:

python ./code/prediction.py ./images/111.jpg
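For orientation, a prediction call of this kind typically uploads the image over HTTP, authenticated with the API key. The sketch below is an assumption-laden illustration, not the contents of ./code/prediction.py: the base URL and `LabelFile` path are assumed, the function names are hypothetical, and it relies on the third-party `requests` package. Check the repository's script and the official API documentation before using it.

```python
# Assumed base URL for the object detection API; verify before use.
BASE_URL = "https://app.nanonets.com/api/v2/ObjectDetection/Model"


def prediction_url(model_id):
    """Build the (assumed) prediction endpoint for a given model ID."""
    return f"{BASE_URL}/{model_id}/LabelFile/"


def predict(image_path, api_key, model_id):
    """Upload one image and return the decoded JSON response."""
    import requests  # third-party: pip install requests

    with open(image_path, "rb") as image_file:
        response = requests.post(
            prediction_url(model_id),
            auth=(api_key, ""),  # API key as the basic-auth username (assumed)
            files={"file": image_file},
        )
    response.raise_for_status()
    return response.json()
```

A call would look like `predict("./images/111.jpg", api_key, model_id)`, with the key and model ID taken from the environment variables set in Steps 3 and 5.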

Note: For convenience, the Python sample uses the converted JSON payload instead of the XML payload, so it has no dependencies.