You can find the official report here
Synthetic image generation for handwritten text recognition (HTR) using PyTorch, based on WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models [1].
Create a new virtual environment and install all the necessary Python packages:
python3 -m venv SyntheticHTR-env
source SyntheticHTR-env/bin/activate
pip install --upgrade pip
python3 -m pip install -r SyntheticHTR/requirements.txt
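After installation, a quick sanity check can confirm that the virtual environment is active and that the key dependency is importable. The snippet below is a minimal sketch, not part of the repository: the helper names `in_virtualenv` and `missing_packages` are hypothetical, and checking for `torch` is an assumption based on this project using PyTorch (the actual contents of `requirements.txt` are not listed here).

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv (such as SyntheticHTR-env)."""
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: "torch" is the import name of the PyTorch dependency.
print("virtual environment active:", in_virtualenv())
print("missing packages:", missing_packages(["torch"]))
```

If `torch` shows up as missing, re-run the `pip install -r` step with the environment activated.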
- Download our pre-trained models.
- Download the synthetic datasets.
- Use the pre-trained models for sampling data or fine-tuning on additional datasets.
Download our pre-trained models from here. There are three models in total, one for each dataset they were trained on.
As part of our research, we fully synthesized four datasets and then refined them by keeping only the highest-quality synthetic images.
You can download our best synthesized datasets:
- IAM dataset from here
- George Washington dataset from here
- IMGUR5k dataset from here
- Out-Of-Vocabulary IAM dataset from here
Download a dataset of your choice with word-level images, then preprocess the data:
python3 datasets/process_dataset.py
Train the model on your dataset:
python3 model/train.py
Lastly, you may want to fully synthesize the dataset:
python3 sampling/full_sampling.py
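The three steps above (preprocess, train, sample) can be chained from a single script. The following is a minimal sketch under stated assumptions: it is not shipped with the repository, the names `STEPS`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes each script runs without extra command-line arguments from the repository root. Any options the scripts actually accept are not covered here.

```python
import subprocess
import sys
from pathlib import Path

# Script paths exactly as named in this README, relative to the repository root.
STEPS = [
    ("preprocess", "datasets/process_dataset.py"),
    ("train", "model/train.py"),
    ("sample", "sampling/full_sampling.py"),
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step, in order."""
    return [[python, script] for _, script in STEPS]


def run_pipeline(repo_root=".", dry_run=False):
    """Run the steps in order, stopping at the first failure.

    With dry_run=True the commands are only printed, nothing is executed.
    """
    root = Path(repo_root)
    for (name, script), cmd in zip(STEPS, build_commands()):
        if dry_run:
            print(f"[{name}] would run: {' '.join(cmd)}")
            continue
        if not (root / script).is_file():
            raise FileNotFoundError(f"{script} not found under {root.resolve()}")
        # check=True raises CalledProcessError if the step exits with a non-zero status.
        subprocess.run(cmd, cwd=root, check=True)
```

Calling `run_pipeline(dry_run=True)` prints the planned commands without executing them; `run_pipeline()` runs them from the current directory. Stopping at the first failing step avoids training on data that was not preprocessed successfully.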
[1]: Nikolaidou, K., Retsinas, G., Christlein, V., Seuret, M., Sfikas, G., Smith, E. B., Mokayed, H., & Liwicki, M. (2023). WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models. arXiv preprint arXiv:2303.16576. https://arxiv.org/abs/2303.16576