A deep learning framework for historical document image analysis.
Install dependencies
# clone project
git clone https://github.com/DIVA-DIA/unsupervised_learning.git
cd unsupervised_learning
# create conda environment (IMPORTANT: needs Python 3.8+)
conda env create -f conda_env_gpu.yaml
# activate the environment using .autoenv
source .autoenv
# install requirements
pip install -r requirements.txt
Train model with default configuration.
Caution: you must first change the value of data_dir in config/datamodule/cb55_10_cropped_datamodule.yaml.
# default run based on config/config.yaml
python run.py
# train on CPU
python run.py trainer.gpus=0
# train on GPU
python run.py trainer.gpus=1
Train using GPU
# [default] train on all available GPUs
python run.py trainer.gpus=-1
# train on one GPU
python run.py trainer.gpus=1
# train on two GPUs
python run.py trainer.gpus=2
# train on CPU
python run.py trainer.accelerator=ddp_cpu
Train using CPU for debugging
# train on CPU
python run.py trainer.accelerator=ddp_cpu trainer.precision=32
Train model with chosen experiment configuration from configs/experiment/
python run.py experiment=experiment_name
You can override any parameter from the command line, for example:
python run.py trainer.max_epochs=20 datamodule.batch_size=64
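To illustrate what such dotted overrides do, here is a minimal sketch in plain Python. It is not the framework's actual config loader; `apply_overrides` is a hypothetical helper, and the starting values in `config` are placeholders rather than the real defaults from config/config.yaml.

```python
import ast
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied.

    Hypothetical helper for illustration only; the framework's own
    override handling may behave differently.
    """
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        # Try to interpret the value as a Python literal (int, float, bool, ...),
        # otherwise keep it as a plain string.
        try:
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            value = raw_value
        # Walk down the nested dictionaries, creating levels that do not exist.
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Placeholder values, not the real defaults of the framework.
config = {
    "trainer": {"max_epochs": 10, "gpus": -1},
    "datamodule": {"batch_size": 16},
}
updated = apply_overrides(
    config, ["trainer.max_epochs=20", "datamodule.batch_size=64"]
)
print(updated)
```

Each override names a path through the nested config (`trainer` then `max_epochs`) and replaces only that leaf, leaving every other setting unchanged.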
- Fork this repo
- Clone the repo to your local filesystem (git clone CLONELINK)
- Clone the repo onto your remote machine
- Move into the folder on your remote machine and create the conda environment (conda env create -f conda_env_gpu.yaml)
- Run source .autoenv in the root folder on your remote machine (activates the environment)
- Open the folder in PyCharm (File -> Open)
- Add the interpreter (Preferences -> Project -> Python interpreter -> top left gear icon -> add... -> SSH Interpreter) and follow the instructions (set the correct mapping to enable deployment)
- Upload the files (deployment)
- Create a wandb account (wandb.ai)
- Log in to your remote machine via SSH
- Go to the root folder of the framework and activate the environment (source .autoenv OR conda activate unsupervised_learning)
- Log into wandb: execute wandb login and follow the instructions
- Now you should be able to run the basic experiment from PyCharm
You can load the individual model parts (backbone or header) as well as the whole task.
To load the backbone or the header, add the field path_to_weights to your experiment config, e.g.:
model:
  header:
    path_to_weights: /my/path/to/the/pth/file
To load the whole task, pass the path to its checkpoint to the trainer via the field resume_from_checkpoint, e.g.:
trainer:
  resume_from_checkpoint: /path/to/.ckpt/file
You can freeze either part of the model (backbone or header) with the freeze flag in the config.
For example, to freeze the backbone:
In the command line:
python run.py +model.backbone.freeze=True
In the config (e.g. model/backbone/baby_unet.yaml):
...
freeze: True
...
CAUTION: You cannot train a model that has no trainable parameters (e.g. when both the backbone and the header are frozen).
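The constraint on freezing can be checked before a run is started. The following is a minimal sketch in plain Python that works on a config dictionary. `has_trainable_part` is a hypothetical helper, and the assumption that a missing freeze flag means "not frozen" is mine, not something the framework documents here.

```python
def has_trainable_part(model_config):
    """Return True if at least one of backbone/header is not frozen.

    Hypothetical check for illustration; a missing `freeze` flag is
    assumed to mean that the part stays trainable.
    """
    parts = ("backbone", "header")
    return any(
        not model_config.get(part, {}).get("freeze", False) for part in parts
    )


# Only the backbone is frozen: the header can still be trained.
backbone_frozen = {"backbone": {"freeze": True}, "header": {}}

# Both parts are frozen: there is nothing left to train.
everything_frozen = {"backbone": {"freeze": True}, "header": {"freeze": True}}

print(has_trainable_part(backbone_frozen))
print(has_trainable_part(everything_frozen))
```

A check of this kind gives an early, readable error instead of a failure deep inside the training loop.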
If you use the selection key, you can pass either an int, which takes the first n files, or a list of strings to filter the different datasets.
If you are using a full-page dataset, note that the selection list is a list of file names without the extension.
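The two forms of the selection key can be sketched as follows for the full-page case. This is an illustration under stated assumptions, not the framework's code: `apply_selection` is a hypothetical helper, the file names are made up, and sorting the files before taking the first n is an assumed ordering.

```python
from pathlib import Path


def apply_selection(file_paths, selection):
    """Filter `file_paths` by an int (first n files) or a list of names.

    Hypothetical helper; names in a list selection are compared against
    the file name without its extension.
    """
    ordered = sorted(file_paths)  # assumed ordering, for a stable "first n"
    if isinstance(selection, int):
        return ordered[:selection]
    wanted = set(selection)
    return [path for path in ordered if Path(path).stem in wanted]


# Made-up file names for illustration.
files = ["data/page_b.png", "data/page_a.png", "data/page_c.png"]

first_two = apply_selection(files, 2)
by_name = apply_selection(files, ["page_c", "page_a"])
print(first_two)
print(by_name)
```

Note that the list form uses `page_a` rather than `page_a.png`, matching the rule that names are given without the extension.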
@inproceedings{vogtlin2023DIVADAFDeepLearning,
author = {Lars V{\"{o}}gtlin and
Anna Scius{-}Bertrand and
Paul Maergner and
Andreas Fischer and
Rolf Ingold},
title = {{DIVA-DAF:} {A} Deep Learning Framework for Historical Document Image
Analysis},
booktitle = {Proceedings of the 7th International Workshop on Historical Document
Imaging and Processing, HIP@ICDAR 2023, San Jose, CA, USA, August
25-26, 2023},
pages = {61--66},
publisher = {{ACM}},
year = {2023},
url = {https://doi.org/10.1145/3604951.3605511},
doi = {10.1145/3604951.3605511}
}