The GERALD dataset contains 5000 individual images and annotations for 33554 objects. Our focus was on annotating light signals; however, we decided to also include annotations for other objects (mostly static signs) for a more comprehensive understanding of the environment. Of the three signalling systems in use in Germany, we decided to gather images only from the H/V- and Ks-Signalling-System. The additional Hl-Signalling-System is only in use on some tracks in the territory of the former East Germany, and we found only a few available videos showing these signals. The signal aspects of the H/V- and Ks-System form the main classes of the dataset:
- H/V-Signalling-System: Hp 0 (HV), Hp 1, Hp 2, Vr 0, Vr 1, Vr 2
- Ks-Signalling-System: Hp 0 (Ks), Ks 1, Ks 2
The following table specifies how many instances of each main class were labelled:
Hp 0 (HV) | Hp 1 | Hp 2 | Vr 0 | Vr 1 | Vr 2 | Hp 0 (Ks) | Ks 1 | Ks 2 |
---|---|---|---|---|---|---|---|---|
1700 | 973 | 627 | 1422 | 1115 | 554 | 807 | 1182 | 761 |
18.6 % | 10.6 % | 6.9 % | 15.6 % | 12.2 % | 6.1 % | 8.8 % | 12.9 % | 8.3 % |
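
The percentages in the table appear to be relative to the sum of the main-class instances, not to all 33554 annotated objects. As a minimal sketch of that arithmetic, the counts below are copied from the table; the variable names are only illustrative.

```python
# Instance counts per main class, copied from the table above.
main_class_counts = {
    "Hp 0 (HV)": 1700,
    "Hp 1": 973,
    "Hp 2": 627,
    "Vr 0": 1422,
    "Vr 1": 1115,
    "Vr 2": 554,
    "Hp 0 (Ks)": 807,
    "Ks 1": 1182,
    "Ks 2": 761,
}

# Total number of main-class instances (the denominator for the shares).
total = sum(main_class_counts.values())

# Share of each class in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in main_class_counts.items()}

print("main-class instances:", total)
for name, share in shares.items():
    print(f"{name}: {share} %")
```

The main classes sum to 9141 instances, so the remaining annotations belong to the additional classes described below.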
In addition, many more signal types were labelled to obtain a more complete dataset of German mainline railway signals and to enable the detection of mast signs, hectometre signs, etc. The following figure shows all classes and the corresponding number of labelled instances.
For each image we also added information about the weather and light conditions, which are distributed as follows:
Unknown | Sunny | Cloudy | Rainy | Snowy | Foggy |
---|---|---|---|---|---|
565 | 996 | 1925 | 1068 | 164 | 282 |
11.3 % | 19.9 % | 38.5 % | 21.4 % | 3.3 % | 5.6 % |
Note: the Unknown weather tag is used for pictures taken at night or in tunnels.
Daylight | Twilight | Dark |
---|---|---|
2969 | 1401 | 630 |
59.4 % | 28.0 % | 12.6 % |
This video shows an example of a YOLOv4-based detector trained on the GERALD dataset.
An accompanying publication can be found here: https://journals.sagepub.com/doi/10.1177/09544097231166472. The paper includes more information about autonomous driving in railways in general, additional statistics, and a deeper analysis of the dataset. We also show some example results from a YOLOv4 network trained on GERALD.
For easy data handling and revision, the annotations come in the PASCAL VOC format. This format consists of one XML file per image, containing all labelled instances and additional information such as the width and height of the image. All further information that does not fit the PASCAL VOC format (e.g. weather, light, source URL) is saved in the info.json. PASCAL VOC provides a "difficult" tag for each annotation. In this dataset, the difficult tag is used to indicate whether the signal was relevant to the train conductor in that situation.
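To illustrate the annotation layout, here is a minimal sketch that reads a PASCAL VOC annotation with the Python standard library only. The sample XML, file name, class names and coordinates are hypothetical and not taken from GERALD, and the helper `parse_voc` is not part of gerald-tools. The sketch returns the raw difficult flag and makes no assumption about which value marks a relevant signal.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC layout. The file name, class
# names and coordinates are made up for illustration only.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>Hp_0_HV</name>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>140</xmax><ymax>280</ymax></bndbox>
  </object>
  <object>
    <name>Ks_1</name>
    <difficult>1</difficult>
    <bndbox><xmin>900</xmin><ymin>310</ymin><xmax>920</xmax><ymax>350</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return image metadata and the list of labelled instances."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    image = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
    }
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            # Raw flag as stored in the file; its meaning is not interpreted here.
            "difficult": int(obj.findtext("difficult", default="0")),
            "bbox": tuple(
                int(box.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
            ),
        })
    return image, objects


image, objects = parse_voc(SAMPLE_XML)
print(image)
for obj in objects:
    print(obj["name"], obj["difficult"], obj["bbox"])
```

To read a file from disk instead of a string, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.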
The images come in the .jpg format and are either 1280x720 or 1920x1080.
The individual frames were extracted from video recordings of cab view rides that have been uploaded to YouTube. We asked the uploaders for permission to use their video material for our dataset. Microsoft's video annotation tool VoTT was used to find and annotate relevant frames. In a second step, the images and annotations were revised and checked with LabelImg.
The images and annotations can be found here
pip install gerald-tools
- Install git (Instructions)
- Clone the repo: `git clone git@github.com:ifs-rwth-aachen/GERALD.git`
- The `main.py` includes some example code to load the dataset and display one image including the annotations.
If you want to contribute to the dataset or our research, please contact us. You can find our contact information further below.
We would like to thank all YouTubers for supporting us with their cab view recordings and kindly allowing us to use their material. The following YouTubers contributed their material to our dataset:
The GERALD Dataset (Annotations for the respective images) by Philipp Leibner and Fabian Hampel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The gerald-tools provided in this repository are licensed under the Apache 2.0 License.
- Philipp Leibner - philipp.leibner@ifs.rwth-aachen.de
- Fabian Hampel - fabian.hampel@ifs.rwth-aachen.de