Doc Matching - A Similar Document Template Matching ML Model to detect fraudulent documents for insurance claims
Pre - Smart India Hackathon '23 VJTI - Team TechnoSrats
Fraudulent transactions and invoices are serious problems in the financial services and insurance industries; KPMG reported over a billion dollars in losses due to fraudulent transactions. Thousands of man-hours are lost each year to tedious manual checking of invoices and documents to confirm their validity. Standard information common to most insurance-related documents also has to be extracted. With the advent of advanced computer vision and object detection models, automating these onerous tasks has become possible.
The key features are:
- Problem Statement ID: SIH1441
- Problem Statement Title: Similar Document Template Matching Algorithm from Bajaj Finserv Health Ltd
- NextJS
- Material UI
- PostgreSQL (using Supabase)
- Hasura GraphQL API (over the Postgres DB)
- FastAPI (for the model)
- TensorFlow (for the deep-learning-based bounding box model)
- scikit-learn (for NLP-based Named Entity Recognition)
- Clone the GitHub repo
$ git clone https://github.com/saRvaGnyA/similar-doc-matching.git
- Enter the client directory and install all the required dependencies. Ensure that any globally installed packages such as the React CLI, Tailwind CLI, PostCSS CLI or ESLint are uninstalled before proceeding.
$ cd client
$ yarn install
- Set up the .env file for storing the environment variables. A demo file for this is as follows:
NEXT_PUBLIC_HASURA_ADMIN_SECRET = your hasura admin key
NEXT_PUBLIC_SUPABASE_ANON_KEY = your supabase anon key
NEXT_PUBLIC_SUPABASE_URL = your supabase public url
- If you are working on Visual Studio Code or WebStorm, it'd be convenient to install the extensions for Prettier and ESLint.
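A missing or empty variable in the .env file tends to show up only later as a confusing runtime error, so it can help to check the file right after creating it. The snippet below is a minimal sketch, not part of the repository: the helper names `parse_env` and `missing_keys` are hypothetical, and the only thing taken from this README is the three variable names in the demo file.

```python
from pathlib import Path
from typing import Dict, List

# Variable names copied from the demo .env file in this README.
REQUIRED_KEYS = (
    "NEXT_PUBLIC_HASURA_ADMIN_SECRET",
    "NEXT_PUBLIC_SUPABASE_ANON_KEY",
    "NEXT_PUBLIC_SUPABASE_URL",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse 'KEY = value' lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env_path) -> List[str]:
    """Return the required keys that are absent or empty in the given file."""
    path = Path(env_path)
    values = parse_env(path.read_text()) if path.is_file() else {}
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys("client/.env")` returns an empty list once all three variables are filled in, assuming the file lives in the client directory.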
- Clone the GitHub repo
$ git clone https://github.com/saRvaGnyA/similar-doc-matching.git
- Create a virtual environment from the Anaconda command prompt (install conda first if it is not installed) and then switch to that virtual environment. Let's say the name of the environment is test.
$ conda create -n test python=3.8 anaconda
$ conda activate test
- Look for requirements.txt and install the packages.
$ pip install -r requirements.txt
- Look for the main.py and utils.py files and have them ready. (The packages for FastAPI will already have been installed by the pip install command in step 3 above.)
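Before moving on, a quick check that the environment is complete can save a failed server start later. The sketch below is an illustration only: the helper names are hypothetical, and the module names in `ASSUMED_MODULES` are guesses based on the tech stack listed in this README (FastAPI, TensorFlow, scikit-learn), so adjust them to match requirements.txt. It is written to run on the Python 3.8 environment created above.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Dict, Iterable

# Assumed top-level import names, inferred from the tech stack list;
# the authoritative list is whatever requirements.txt contains.
ASSUMED_MODULES = ("fastapi", "tensorflow", "sklearn")
# File names mentioned in the setup steps of this README.
EXPECTED_FILES = ("main.py", "utils.py", "requirements.txt")


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


def check_files(directory, names: Iterable[str]) -> Dict[str, bool]:
    """Map each file name to whether it exists inside `directory`."""
    base = Path(directory)
    return {name: (base / name).is_file() for name in names}
```

Calling `check_modules(ASSUMED_MODULES)` inside the activated conda environment and `check_files(".", EXPECTED_FILES)` from the directory that holds main.py shows at a glance which entries are `False`.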
Once the required setup and installation is completed, you can start developing and running the project.
- Go to the frontend directory and run the dev script to activate the development server
$ npm run dev
- Before pushing any commit, make sure to run the lint script and fix any linting errors
$ npm run lint
- If you get an ESLint, Tailwind or PostCSS version conflict error, make a .env file in the client directory with the following contents:
SKIP_PREFLIGHT_CHECK = true
- Navigate to the Model directory. The models for the project are in the gesture_model.tflite file.
- Open the Anaconda command prompt and switch to the virtual environment that you created (example: test).
$ conda activate test
- To start the server, type the following in the command prompt
$ python main.py
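Once the server has started, it can be useful to confirm that it is reachable before pointing the frontend at it. The following is a rough sketch using only the standard library; `is_server_up` is a hypothetical helper, and the address in the usage note is an assumption (port 8000 is the uvicorn default and /docs is where FastAPI serves its interactive docs by default, but main.py may configure something else).

```python
from http.client import HTTPException
from urllib.request import urlopen


def is_server_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` receives a successful response."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, HTTPException):
        # URLError, HTTPError, refused connections and timeouts are OSError
        # subclasses; ValueError covers malformed URLs.
        return False
```

For example, `is_server_up("http://127.0.0.1:8000/docs")` should return `True` while the server is running, if it listens on the assumed address.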