The Synthetic Turkish Scene Text Recognition (STS-TR) dataset is a large-scale synthetic dataset created to complement the real-world Turkish Scene Text Recognition (TS-TR) dataset. It contains over 12 million synthetic samples covering a wide range of textual scenarios: Turkish words and phrases rendered in diverse fonts, sizes, and orientations on generic background scenes, with realistic effects such as shadows, blur, and environmental distortions. The dataset substantially expands the training data available for scene text recognition models, particularly for the Turkish language.
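For intuition, the sketch below shows one way such a sample could be rendered with Pillow. It is not the STS-TR generation code; the `render_sample` helper, the font and background paths, and the effect parameters are illustrative placeholders.

```python
# A minimal sketch of rendering a synthetic scene-text sample with Pillow.
# NOT the STS-TR generation pipeline; paths and parameters are placeholders.
import random
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def render_sample(word: str, font_path: str, background_path: str) -> Image.Image:
    # Load a generic background crop and a Turkish-capable font.
    background = Image.open(background_path).convert("RGB").resize((256, 64))
    font = ImageFont.truetype(font_path, size=random.randint(24, 40))

    # Draw the word at a slightly randomized position and color.
    draw = ImageDraw.Draw(background)
    x, y = random.randint(0, 40), random.randint(0, 16)
    color = tuple(random.randint(0, 80) for _ in range(3))
    draw.text((x, y), word, font=font, fill=color)

    # Apply a mild blur and rotation to mimic real-world distortions.
    background = background.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.0, 1.5)))
    background = background.rotate(random.uniform(-5, 5), expand=False)
    return background

# Example usage with hypothetical font and background files:
# sample = render_sample("çilek", "fonts/SomeFont.ttf", "background/val2017/some_image.jpg")
# sample.save("sample.png")
```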
Figure 1: MViT-TR architecture.
To set up MViT-TR for training and evaluation, follow these steps:
- Clone the repository:

  ```bash
  git clone https://github.com/serdaryildiz/STS-TR.git
  cd STS-TR
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Arrange the sources folder as follows:

  ```
  .
  ├── background
  │   └── val2017
  ├── fonts
  ├── text
  └── textures
      └── dtd
          └── images
  ```

- Run `main.py` (an optional sanity check for the folder layout is sketched after these steps):

  ```bash
  python main.py
  ```
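Before running `main.py`, a quick check like the one below can confirm that the sources folder matches the layout shown above. This snippet is not part of the repository; the expected paths are taken from the tree listing.

```python
# Optional sanity check (not part of the repository): verify that the sources
# folder matches the expected layout before running main.py.
from pathlib import Path

EXPECTED_DIRS = [
    "background/val2017",
    "fonts",
    "text",
    "textures/dtd/images",
]

def check_sources(root: str = ".") -> bool:
    missing = [d for d in EXPECTED_DIRS if not (Path(root) / d).is_dir()]
    for d in missing:
        print(f"Missing directory: {d}")
    return not missing

if __name__ == "__main__":
    if check_sources():
        print("Sources folder looks complete.")
```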
Figure 2: Samples from the STS-TR dataset.
If you find this work useful, please cite our paper:
```bibtex
@article{YILDIZ2024101881,
  title    = {Turkish scene text recognition: Introducing extensive real and synthetic datasets and a novel recognition model},
  journal  = {Engineering Science and Technology, an International Journal},
  volume   = {60},
  pages    = {101881},
  year     = {2024},
  issn     = {2215-0986},
  doi      = {10.1016/j.jestch.2024.101881},
  url      = {https://www.sciencedirect.com/science/article/pii/S2215098624002672},
  author   = {Serdar Yıldız},
  keywords = {Scene text recognition dataset, Synthetic scene text recognition dataset, Patch masking, Position attention, Vision transformers},
}
```