The arXiv paper is available at: https://arxiv.org/abs/2210.15366
ERGL is currently being uploaded.
The top 25 events are selected over the entire dataset rather than for each target scene individually. As a result, some events may look out of place in the graph tree of a given scene graph. At the semantic level, these 25 event classes are somewhat insufficient to describe the 10 scene classes. The top 25 events are chosen automatically by the classification model, without any manually specified prior knowledge.
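To illustrate why a dataset-wide selection can leave a scene with events that fit it poorly, here is a minimal sketch contrasting a global top-k with a per-scene top-k. It is not the released code: the function names, the scene and event names, and the use of raw counts as a stand-in for the classification model's scores are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: scene class -> {event class: score}.
# Counts stand in for whatever score the classification model produces.
toy_counts = {
    "airport": {"takeoff": 9, "boarding": 7, "parade": 1},
    "beach": {"swimming": 8, "parade": 2, "boarding": 1},
}


def rank_events(counts, k):
    """Return the k highest-scoring events; ties are broken by event name."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [event for event, _ in ranked[:k]]


def global_top_k(counts_by_scene, k):
    """Select one shared event list from scores summed over all scenes."""
    total = Counter()
    for counts in counts_by_scene.values():
        total.update(counts)  # adds the per-scene scores to the running total
    return rank_events(total, k)


def per_scene_top_k(counts_by_scene, k):
    """Select a separate event list for every scene (not what is done here)."""
    return {scene: rank_events(counts, k) for scene, counts in counts_by_scene.items()}


if __name__ == "__main__":
    print("global:", global_top_k(toy_counts, 2))
    print("per scene:", per_scene_top_k(toy_counts, 2))
```

In this toy example the shared list is dominated by the airport events, so the beach scene is described by events that are weak for it, while a per-scene selection would have kept its own strongest events. The same effect, at a larger scale, explains the odd-looking events in some scene graphs.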