GERALD: A Python repository from the Chair and Institute for Rail Vehicles and Transport Systems

“Ks-Vorsignal (Ks 1)”, by "Markus5linger", licensed under CC BY 4.0

The GERALD Dataset

German Railway Lightsignal Dataset

About the Project
Getting Started
- Download the Dataset
- Install the python support library
Usage
Contributing
Contributors
License
Contact
Related datasets

About The Project

General Information

The GERALD dataset contains 5000 individual images and annotations for 33554 objects. Our focus was on annotating light signals; however, we decided to also include annotations for other objects (mostly static signs) for a more comprehensive understanding of the environment. Of the three signalling systems used in Germany, we decided to gather images only from the H/V- and Ks-Signalling-System. The additional Hl-Signalling-System is only in use on some tracks in the territory of former East Germany, and we found only a few available videos showing these signals. The signal aspects of the H/V- and Ks-System form the main classes of the dataset:

H/V-Signalling-System: Hp 0 (HV), Hp 1, Hp 2, Vr 0, Vr 1, Vr 2
Ks-Signalling-System: Hp 0 (Ks), Ks 1, Ks 2

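The following table specifies how many instances of each main class were labelled. The percentages are relative to the 9141 main-class instances in total, not to all 33554 annotated objects: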

Hp 0 (HV)	Hp 1	Hp 2	Vr 0	Vr 1	Vr 2	Hp 0 (Ks)	Ks 1	Ks 2
1700	973	627	1422	1115	554	807	1182	761
18.6 %	10.6 %	6.9 %	15.6 %	12.2 %	6.1 %	8.8 %	12.9 %	8.3 %
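The percentage row can be reproduced from the instance counts with a few lines of Python. This is only a small sketch for checking the arithmetic; the dictionary and function names are made up for this example and are not part of gerald-tools.

```python
# Instance counts copied from the table above.
MAIN_CLASS_COUNTS = {
    "Hp 0 (HV)": 1700,
    "Hp 1": 973,
    "Hp 2": 627,
    "Vr 0": 1422,
    "Vr 1": 1115,
    "Vr 2": 554,
    "Hp 0 (Ks)": 807,
    "Ks 1": 1182,
    "Ks 2": 761,
}


def class_shares(counts):
    """Return each class's share of the total in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = class_shares(MAIN_CLASS_COUNTS)
for name, share in shares.items():
    print(f"{name}: {share} %")
```

The shares are computed against the sum of the nine main classes, which is why they add up to roughly 100 % even though the dataset contains many more annotated objects.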

In addition, many more signal types were labelled to obtain a more complete dataset of German mainline railway signals and to enable detection of mast signs, hectometre signs, etc. The following figure shows all classes and their corresponding number of labelled instances.

For each image we also added information about the weather and light conditions, which are distributed as follows:

Unknown	Sunny	Cloudy	Rainy	Snowy	Foggy
565	996	1925	1068	164	282
11.3 %	19.9 %	38.5 %	21.4 %	3.3 %	5.6 %

* The Unknown weather tag is used for pictures taken at night or in tunnels.

Daylight	Twilight	Dark
2969	1401	630
59.4 %	28.0 %	12.6 %

Example Video

This video shows an example of a YOLOv4-based detector trained on the GERALD dataset.

Research Paper

An accompanying publication can be found here: https://journals.sagepub.com/doi/10.1177/09544097231166472. The paper includes more information about autonomous driving in railways in general, additional statistics, and a deeper analysis of the dataset. We also show some example results based on a YOLOv4 network trained on GERALD.

Data format

For easy data handling and revision, the annotations come in the PASCAL VOC format. This format consists of an individual XML file for every image, containing all labelled instances and additional information such as the width and height of the image. All further information that does not fit the PASCAL VOC format is saved in the info.json (e.g. weather, light, source URL). PASCAL VOC provides a "difficult" tag for each annotation. In this dataset, the difficult tag was used to indicate whether the signal was relevant to the train conductor in that situation.
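As a rough illustration of the annotation layout, the following sketch reads one PASCAL VOC annotation using only the Python standard library. It relies on the usual VOC element names (size, object, name, difficult, bndbox); the sample XML, the file name, and the class label string in it are invented for this example and may differ from the actual GERALD files. The helper names are hypothetical and not part of gerald-tools.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class VocObject:
    name: str          # class label as stored in the XML
    difficult: bool    # True if the VOC "difficult" flag is set to 1
    box: tuple         # (xmin, ymin, xmax, ymax) in pixels


def parse_voc(xml_text):
    """Parse one PASCAL VOC annotation and return (width, height, objects)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    objects = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Coordinates are sometimes written as floats, so convert via float first.
        box = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        flag = obj.findtext("difficult", default="0").strip()
        objects.append(VocObject(obj.findtext("name"), flag == "1", box))
    return width, height, objects


# Invented sample annotation; the label "Ks_1" is a placeholder.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>Ks_1</name>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>130</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""

width, height, objects = parse_voc(SAMPLE_XML)
print(width, height, objects)
```

The sketch only exposes the difficult flag as a boolean. Which flag value marks a signal as relevant should be checked against the dataset itself or the paper before filtering on it.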

The images come in the .jpg format and have a resolution of either 1280x720 or 1920x1080 pixels.
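Because every image has its own XML file, dataset-wide statistics such as per-class instance counts can in principle be recomputed by iterating over the annotation files. The sketch below assumes that all XML files sit in a single directory; that layout and the function name are assumptions made for this example, so the path has to be adapted to wherever the downloaded annotations actually live.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_classes(annotation_dir):
    """Count labelled instances per class over all VOC XML files in a directory."""
    counts = Counter()
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        # Each <object> element is one labelled instance; <name> holds its class.
        for obj in root.iter("object"):
            counts[obj.findtext("name")] += 1
    return counts
```

A call such as `count_classes("path/to/annotations").most_common(10)` would then list the ten most frequent classes, where the path is a placeholder.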

How the data was gathered

The individual frames were extracted from video recordings of cab view rides that have been uploaded to YouTube. We asked the uploaders for permission to use their video material for our dataset. Microsoft's video annotation tool VoTT was used to find and annotate relevant frames. In a second step, the images and annotations were revised and checked with LabelImg.

Getting Started

Download the Dataset

The images and annotations can be found here

Install the python support library

Install via pip

pip install gerald-tools

Install using this repo

Install git (Instructions)

Clone the repo

git clone git@github.com:ifs-rwth-aachen/GERALD.git

Usage

The main.py includes some example code to load the dataset and display one image including the annotations.

Contributing

If you want to contribute to the dataset or our research, please contact us. You can find our contact information further below.

Contributors

We would like to thank all YouTubers for supporting us with their cab view recordings and kindly allowing us to use their material. The following YouTubers contributed their material to our dataset:

License

The GERALD Dataset (Annotations for the respective images) by Philipp Leibner and Fabian Hampel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

The gerald-tools provided in this repository are licensed under the Apache 2.0 License.

Contact

Philipp Leibner - philipp.leibner@ifs.rwth-aachen.de
Fabian Hampel - fabian.hampel@ifs.rwth-aachen.de

Related Datasets

RailSem19 (For general semantic scene understanding of railway related scenes)
FRSign (Dataset for French railway signals)
COCO (Includes bounding boxes for trains, cars and traffic lights (treats railway signals as traffic lights))
