Strikethrough Removal From Handwritten Words Using CycleGANs


Code and related resources for the ICDAR 2021 paper "Strikethrough Removal From Handwritten Words Using CycleGANs".

Table of Contents

  1. Code
    1. Strikethrough Removal
    2. Strikethrough Classification
    3. Strikethrough Identification
    4. Running the Code
  2. Data
  3. Citation
  4. Acknowledgements


Code

Each of the following subdirectories contains the code used for this paper, along with its Python requirements and the original configuration file(s). Before running anything, edit the configuration files so that the input and output directories point to local paths.

Model checkpoints are attached to the release of this repository.

Strikethrough Removal

  • code for training various forms of CycleGANs to remove strikethrough from handwritten words
  • the CycleGAN code is based on

Strikethrough Classification

  • code to train a DenseNet121 to classify a struck-through word image into one of seven types of strikethrough

Strikethrough Identification

  • code to train a DenseNet121 to identify whether a given word image is struck-through or not (i.e. 'clean')

Running the Code


In order to train any of the three models, run:

python src/ -configfile <path to config file> -config <name of section from config file>

If -configfile is omitted, the script looks for config.cfg in the current working directory. If -config is omitted, the script uses the DEFAULT section.
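The fallback behaviour described above can be illustrated with a small sketch. It is not the repository's actual argument handling. It assumes the config file is INI-style and read with Python's configparser, which the DEFAULT section name suggests but the text does not confirm. The function names `parse_args` and `load_section` are hypothetical.

```python
import argparse
import configparser
from pathlib import Path


def parse_args(argv=None):
    """Parse the two training flags, applying the documented defaults."""
    parser = argparse.ArgumentParser(description="Sketch of the train CLI")
    # Single-dash long options mirror the flags shown in the command above.
    parser.add_argument("-configfile", type=Path, default="config.cfg")
    parser.add_argument("-config", default="DEFAULT")
    return parser.parse_args(argv)


def load_section(config_file, section):
    """Return the options of one section as a plain dict of strings."""
    parser = configparser.ConfigParser()
    # read_file (unlike read) raises if the file does not exist.
    with open(config_file) as handle:
        parser.read_file(handle)
    # DEFAULT is always available; any other section must be present.
    if section != "DEFAULT" and not parser.has_section(section):
        raise KeyError(f"no section named {section!r} in {config_file}")
    # Values from DEFAULT are inherited by every other section.
    return dict(parser[section])
```

With this sketch, `load_section(args.configfile, args.config)` returns the DEFAULT options when no -config is given. A named section returns its own options plus anything it inherits from DEFAULT.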


For testing, run:

python src/ -configfile <path to config file> -data <path to data dir>
  • -configfile should point to the config file in the output directory of a training run (or to one of the checkpoint config files)
  • -data should point to a directory containing struck and struck_gt sub-directories, e.g. one of the datasets presented in Data
  • an additional flag, -save, can be specified to save the cleaned images; otherwise only the performance metrics (F1 score and RMSE) are logged
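For orientation, the two logged metrics could be computed along the following lines. This is a minimal sketch, not the repository's evaluation code. It assumes greyscale images flattened to sequences of values in [0, 1], with dark ink on a light background, and a binarisation threshold of 0.5 for the F1 score. The function names and the threshold are illustrative assumptions.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equally sized pixel sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared_error / len(predicted))


def f1_score(predicted, target, threshold=0.5):
    """F1 score with ink pixels (values below threshold) as positives."""
    # Assumption: ink is dark, so a pixel counts as ink when below threshold.
    pred_ink = [p < threshold for p in predicted]
    true_ink = [t < threshold for t in target]
    tp = sum(p and t for p, t in zip(pred_ink, true_ink))
    fp = sum(p and not t for p, t in zip(pred_ink, true_ink))
    fn = sum(t and not p for p, t in zip(pred_ink, true_ink))
    denominator = 2 * tp + fp + fn
    # Convention chosen here: no ink in either image yields 0.0.
    return 2 * tp / denominator if denominator else 0.0
```

Under these assumptions, a cleaned image that matches its struck_gt counterpart exactly gives an RMSE of 0 and an F1 score of 1.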



Acknowledgements

  • R. Heil would like to thank Nicolas Pielawski, Håkan Wieslander, Johan Öfverstedt and Anders Brun for their helpful comments and fruitful discussions.
  • The computations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at the High Performance Computing Center North (HPC2N), partially funded by the Swedish Research Council through grant agreement no. 2018-05973.
  • This work is partially supported by the Riksbankens Jubileumsfond (RJ) (Dnr P19-0103:1).