SyntheticHTR: Handwritten Text Image Synthesis based on Latent Diffusion Models

Primary language: Python. License: MIT.

SyntheticHTR

You can find the official report here

Synthetic image generation for handwritten text recognition (HTR) using PyTorch, based on WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models [1].

Dependencies

Create a new virtual environment and install all the necessary Python packages:

python3 -m venv SyntheticHTR-env
source SyntheticHTR-env/bin/activate
pip install --upgrade pip
python3 -m pip install -r SyntheticHTR/requirements.txt
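After installation, a quick sanity check can confirm that the virtual environment is active and that PyTorch can be found. The snippet below is a minimal sketch, not part of the repository: the helper names in_virtualenv and is_installed are hypothetical, and it assumes PyTorch is installed under the module name torch.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


def is_installed(module_name):
    """Return True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    # Assumption: PyTorch is importable as "torch".
    print("torch installed:", is_installed("torch"))
```

If the second line prints False, re-running the pip install command above from inside the activated environment is the first thing to try.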

Content

Our pre-trained models

Download our pre-trained models from here. There are three models in total, one for each dataset they were trained on.

Our synthesized datasets

As part of our research, we fully synthesized four datasets and then refined them by keeping only the highest-quality synthetic images.

You can download our best synthesized datasets: the IAM dataset from here, the George Washington dataset from here, the IMGUR5k dataset from here, and our out-of-vocabulary IAM dataset from here.

Use the model for sampling or fine-tuning on datasets

Download a word-level image dataset of your choice, then run python3 datasets/process_dataset.py to preprocess the data and python3 model/train.py to train the model on it. Finally, to synthesize the full dataset, run python3 sampling/full_sampling.py.
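The three steps above can also be chained from a single script. The following is a minimal sketch under stated assumptions: the script paths are the ones named in this README, but running them from the repository root with no command-line arguments is an assumption, and build_commands and run_pipeline are hypothetical helper names rather than part of the repository.

```python
import subprocess
import sys

# Script paths as named in this README, relative to the repository root.
PIPELINE = [
    "datasets/process_dataset.py",   # preprocess the word-level images
    "model/train.py",                # train the model on the processed data
    "sampling/full_sampling.py",     # synthesize the full dataset
]


def build_commands(python=sys.executable):
    """Return one argv list per pipeline stage, in execution order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(repo_root=".", dry_run=True):
    """Print each stage and, unless dry_run is set, execute it.

    Assumption: every script is launched from the repository root and
    needs no extra arguments; adjust the argv lists if yours do.
    """
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed stage stops the pipeline before the next one runs.
            subprocess.run(cmd, cwd=repo_root, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows the commands that would be executed.
    run_pipeline(dry_run=True)
```

Calling run_pipeline(dry_run=False) from the repository root executes the stages in sequence. Stopping at the first failure avoids training on a half-processed dataset or sampling from a model that never finished training.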

References

[1]: Nikolaidou, K., Retsinas, G., Christlein, V., Seuret, M., Sfikas, G., Smith, E. B., Mokayed, H., & Liwicki, M. (2023). WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models. arXiv preprint arXiv:2303.16576. https://arxiv.org/abs/2303.16576