This repository contains the code for a document classification task using LayoutLMv3.
The dataset used for this task is the Document Classification Dataset from Kaggle.
1 - Install Docker (optional; only needed if you want to run the Docker image)
2 - Install Tesseract
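Before moving on, it can help to confirm that both tools are reachable from the command line. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and it assumes the executables are installed under their usual names, `tesseract` and `docker`.

```python
import shutil


def missing_tools(tools=("tesseract", "docker")):
    """Return the names of the given executables that are not found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```

If you only plan to run the notebook, `docker` may be reported as missing without that being a problem, since Docker is optional.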
1 - Install dependencies
pip install -r requirements.txt
2 - Run all cells in the notebook layoutlmv3_training_inference_notebook.ipynb
1 - Build the Docker image
docker build -t layoutlmv3 .
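2 - Run the Docker image, replacing (filename) with the name of the image file to classify and (path of dir) with the directory that contains it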
docker run -e IMAGE_NAME=(filename) -v (path of dir):/images_dir layoutlmv3
1 - Pull the prebuilt Docker image
docker pull ghcr.io/faizankarim/dl_assignment_px_faizankarim
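2 - Run the pulled Docker image, replacing (filename) and (path of dir) with the image file name and the directory that contains it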
docker run -e IMAGE_NAME=(filename) -v (path of dir):/images_dir ghcr.io/faizankarim/dl_assignment_px_faizankarim
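The two `docker run` commands above differ only in the image name, so the placeholders can be filled in programmatically. The sketch below is one way to do that under stated assumptions: the helper `docker_run_command` is hypothetical and not part of this repository. It takes the path of a single image file, splits it into a directory (mounted at `/images_dir`) and a file name (passed as `IMAGE_NAME`), and returns the argument list. The path is made absolute because Docker may treat a bare relative name in `-v` as a named volume rather than a host directory.

```python
from pathlib import Path


def docker_run_command(image_path, image="layoutlmv3"):
    """Build the `docker run` argument list for classifying one image file.

    `image` is the Docker image name: "layoutlmv3" for a local build, or
    "ghcr.io/faizankarim/dl_assignment_px_faizankarim" for the pulled image.
    """
    path = Path(image_path).resolve()  # absolute path for the bind mount
    return [
        "docker", "run",
        "-e", f"IMAGE_NAME={path.name}",     # file name inside the mounted dir
        "-v", f"{path.parent}:/images_dir",  # host directory -> /images_dir
        image,
    ]
```

To run it, pass the result to `subprocess.run(cmd, check=True)`. For example, `docker_run_command("scans/invoice.png")` (a made-up file name) produces a command that mounts the `scans` directory and sets `IMAGE_NAME=invoice.png`.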