/GERALD

Dataset for German Railway Signals

Primary language: Python. License: Apache License 2.0 (Apache-2.0)

Logo
“Ks-Vorsignal (Ks 1)”, by "Markus5linger", licensed under CC BY 4.0

The GERALD Dataset

German Railway Lightsignal Dataset

About The Project

Screenshot

General Information

The GERALD dataset contains 5000 individual images and annotations for 33554 objects. Our focus was on annotating light signals; however, we also included annotations for other objects (mostly static signs) to allow a more comprehensive understanding of the environment. Of the three signalling systems used in Germany, we gathered images only from the H/V- and Ks-Signalling-System. The third, the Hl-Signalling-System, is in use only on some tracks in the territory of former East Germany, and we found only a few available videos showing these signals. The signal aspects of the H/V- and Ks-System form the main classes of the dataset:

  • H/V-Signalling-System: Hp 0 (HV), Hp 1, Hp 2, Vr 0, Vr 1, Vr 2
  • Ks-Signalling-System: Hp 0 (Ks), Ks 1, Ks 2

The following table specifies how many instances of each main class were labelled:

Class      Hp 0 (HV)  Hp 1    Hp 2   Vr 0    Vr 1    Vr 2   Hp 0 (Ks)  Ks 1    Ks 2
Instances  1700       973     627    1422    1115    554    807        1182    761
Share      18.6 %     10.6 %  6.9 %  15.6 %  12.2 %  6.1 %  8.8 %      12.9 %  8.3 %
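The percentages in this table appear to be relative to the sum of the nine main classes (9141 instances), not to all 33554 annotated objects. A small sketch that recomputes them from the counts listed above; the dictionary keys are simply the column names of the table, not necessarily the label strings used in the annotation files:

```python
# Instance counts per main class, copied from the table above.
counts = {
    "Hp 0 (HV)": 1700, "Hp 1": 973, "Hp 2": 627,
    "Vr 0": 1422, "Vr 1": 1115, "Vr 2": 554,
    "Hp 0 (Ks)": 807, "Ks 1": 1182, "Ks 2": 761,
}

# Total number of main-class instances (the denominator for the shares).
total = sum(counts.values())

# Share of each class in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name:<10} {n:>5} {shares[name]:>5.1f} %")
```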

In addition to the main classes, many more signal types were labelled to obtain a more complete dataset of German mainline railway signals and to enable detection of mast signs, hectometre signs, etc. The following figure shows all classes and the number of labelled instances for each.

For each image we also added information about the weather and light conditions, which are distributed as follows:

Weather  Unknown*  Sunny   Cloudy  Rainy   Snowy  Foggy
Images   565       996     1925    1068    164    282
Share    11.3 %    19.9 %  38.5 %  21.4 %  3.3 %  5.6 %

* The Unknown weather tag is used for images taken at night or in tunnels.

Light    Daylight  Twilight  Dark
Images   2969      1401      630
Share    59.4 %    28.0 %    12.6 %

Example Video

Watch the video

This video shows an example of a YOLOv4-based detector trained on the GERALD dataset.

Research Paper

An accompanying publication can be found here: https://journals.sagepub.com/doi/10.1177/09544097231166472. The paper includes more information about autonomous driving in railways in general, additional statistics, and a deeper analysis of the dataset. We also show some example results based on a YOLOv4 network trained on GERALD.

Data format

For easy data handling and revision, the annotations come in the PASCAL VOC format. This format consists of one XML file per image, containing all labelled instances and additional information such as the width and height of the image. All further information that does not fit the PASCAL VOC format is saved in info.json (e.g. weather, light, source URL). PASCAL VOC provides a "difficult" tag for each annotation. In this dataset, the difficult tag is used to indicate whether the signal was relevant to the train conductor in that situation.
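As a rough illustration of the annotation format, the following sketch reads one PASCAL VOC annotation with Python's standard library only. It does not use gerald-tools. The sample XML, the class names in it (Ks_1, Hp_0_Ks) and the helper names VocObject and parse_voc are made up for this example; the exact label spelling in the dataset may differ. The difficult flag is returned as stored, so check the dataset or the paper for which value marks a signal as relevant.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VocObject:
    name: str                        # class label as stored in the file
    difficult: bool                  # raw PASCAL VOC "difficult" flag
    box: Tuple[int, int, int, int]   # (xmin, ymin, xmax, ymax) in pixels


def parse_voc(xml_text: str) -> Tuple[int, int, List[VocObject]]:
    """Return (width, height, objects) for one PASCAL VOC annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    objects = []
    for obj in root.iter("object"):
        # Go through float() first in case a tool wrote values like "12.0".
        box = tuple(
            int(float(obj.findtext(f"bndbox/{tag}")))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append(
            VocObject(
                name=obj.findtext("name"),
                difficult=obj.findtext("difficult", default="0").strip() == "1",
                box=box,
            )
        )
    return width, height, objects


# Made-up annotation for illustration; not taken from the dataset.
SAMPLE = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>Ks_1</name>
    <difficult>0</difficult>
    <bndbox><xmin>900</xmin><ymin>400</ymin><xmax>930</xmax><ymax>470</ymax></bndbox>
  </object>
  <object>
    <name>Hp_0_Ks</name>
    <difficult>1</difficult>
    <bndbox><xmin>1200</xmin><ymin>380</ymin><xmax>1225</xmax><ymax>440</ymax></bndbox>
  </object>
</annotation>
"""

width, height, objects = parse_voc(SAMPLE)
for o in objects:
    print(o.name, o.difficult, o.box)
```

To read a real annotation, pass the contents of one of the dataset's XML files to parse_voc instead of SAMPLE.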

The images come in the .jpg format and are either 1280x720 or 1920x1080.

How the data was gathered

The individual frames were extracted from video recordings of cab view rides that have been uploaded to YouTube. We asked the uploaders for permission to use their video material for our dataset. Microsoft's video annotation tool VoTT was used to find and annotate relevant frames. In a second step, the images and annotations were revised and checked with LabelImg.

Getting Started

Download the Dataset

The images and annotations can be found here

Install the Python support library

Install via pip

    pip install gerald-tools

Install using this repo

  1. Install git (Instructions)

  2. Clone the repo

    git clone git@github.com:ifs-rwth-aachen/GERALD.git

  3. main.py contains example code that loads the dataset and displays one image with its annotations.

Contributing

If you want to contribute to the dataset or our research, please contact us. You can find our contact information further below.

Contributors

We would like to thank all YouTubers for supporting us with their cab view recordings and for kindly allowing us to use their material. The following YouTubers contributed their material to our dataset:

License

The GERALD Dataset (annotations for the respective images) by Philipp Leibner and Fabian Hampel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

The gerald-tools provided in this repository are licensed under the Apache 2.0 License.

Contact

Philipp Leibner - philipp.leibner@ifs.rwth-aachen.de
Fabian Hampel - fabian.hampel@ifs.rwth-aachen.de

IFS Logo

Related Datasets

  • RailSem19 (For general semantic scene understanding of railway related scenes)
  • FRSign (Dataset for French railway signals)
  • COCO (Includes bounding boxes for trains, cars and traffic lights; treats railway signals as traffic lights)