Multimodal (text + layout/format + image) fine-tuning toolkit for document understanding trained on XFUN.ja
Note: install poppler for your platform; it is a dependency of pdf2image.
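A quick way to confirm that poppler is visible before going further is to look for its command-line tools on PATH. The sketch below is not part of this project: `check_poppler` is a hypothetical helper, and the install commands in the comments are typical examples that may differ on your platform.

```shell
# Report whether poppler tools such as pdftoppm and pdftocairo are on PATH.
check_poppler() {
    missing=0
    for tool in pdftoppm pdftocairo; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical install commands (assumptions; pick the one for your platform):
#   Debian/Ubuntu:  sudo apt-get install poppler-utils
#   macOS:          brew install poppler
#   conda:          conda install -c conda-forge poppler
check_poppler || echo "poppler not found; install it before using pdf2image"
```

If either tool is reported as missing, install poppler and run the check again in a new shell.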
conda create --prefix ./env python=3.8
conda activate ./env
pip install -r requirements.txt
python3 -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
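After the install steps, it can help to confirm that the key packages import cleanly inside the activated environment. This is a minimal sketch, not a script shipped with the project: `check_module` is a hypothetical helper, and the module names are assumed from the packages mentioned above (pdf2image and detectron2).

```shell
# Report whether a Python module can be imported by python3.
check_module() {
    if python3 -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Module names assumed from the packages mentioned in the instructions above.
for module in pdf2image detectron2; do
    check_module "$module" || true
done
```

A "missing" line usually means the environment is not activated or the corresponding pip step failed.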
Pull existing image:
docker pull beomus/layoutxlm:latest
Build from source:
docker build -t layoutxlm .
- On your host:
docker run -it --name inference --mount type=bind,source="$(pwd)"/infer,target=/infer -p 8888:8888 layoutxlm:latest
- Inside your container:
jupyter lab --ip 0.0.0.0 --no-browser --allow-root
- On your host, open in a browser:
http://localhost:8888/tree
The content of this project itself is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). Portions of the source code are based on the transformers project.
This project has adopted the Microsoft Open Source Code of Conduct.
For help or issues using layoutlmft, please submit a GitHub issue.
For other communications related to layoutlmft, please contact Lei Cui (lecu@microsoft.com) or Furu Wei (fuwei@microsoft.com).