PhD Thesis

Topic: Deep Learning on Graph Structured Data: Algorithms and Applications

Abstract

Graph deep learning aims to extend deep learning techniques to non-Euclidean data structures, improving the generalizability of neural models to data with arbitrary structure. Graphs are highly generic data structures that can represent both data that naturally form a network, such as social networks or chemical compounds, and any scenario that can be modelled as a network, such as a scene graph derived from an image or a knowledge graph derived from natural language text. Algorithm designs should therefore not only cater to naturally graph-structured data but also be flexible enough to incorporate application-specific requirements. When designing algorithms for naturally graph-structured data, unsupervised graph representation learning plays a crucial role, especially when data annotation is expensive. Inspired by real-world graph generation processes, where graphs are formed based on one or more global factors common to all elements of the graph (e.g., the topic of a discussion thread or the solubility level of a molecule), we propose to extract graph-wise common latent factors, filtering out node-specific factors, as graph embeddings. We empirically demonstrate that extracting common latent factors not only benefits graph-level tasks, by alleviating distractions caused by local variations of individual nodes or neighbourhoods, but also benefits node-level tasks by enabling long-range node dependencies, especially in disassortative graphs.
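To make the idea concrete, below is a minimal, hypothetical sketch in PyTorch (not the thesis model) of pooling a graph-wise common factor from node embeddings while down-weighting node-specific variation; the class name CommonFactorPooling, the single neighbourhood-averaging step, and all dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class CommonFactorPooling(nn.Module):
    """Pools node embeddings into a single graph-level 'common factor' embedding."""
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.node_proj = nn.Linear(in_dim, hid_dim)  # node-level transform
        self.score = nn.Linear(hid_dim, 1)           # per-node attention score

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (num_nodes, in_dim) node features
        # adj: (num_nodes, num_nodes) adjacency matrix with self-loops
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.node_proj(adj @ x / deg))  # one round of neighbourhood averaging
        # Soft attention over nodes: up-weight nodes that reflect the factor
        # shared by the whole graph, down-weight node-specific outliers.
        alpha = torch.softmax(self.score(h), dim=0)    # (num_nodes, 1)
        return (alpha * h).sum(dim=0)                  # graph embedding, shape (hid_dim,)

# Usage: 5 nodes with 8 features each and a random symmetric adjacency.
x = torch.randn(5, 8)
adj = (torch.rand(5, 5) > 0.5).float()
adj = ((adj + adj.t() + torch.eye(5)) > 0).float()
graph_emb = CommonFactorPooling(8, 16)(x, adj)
print(graph_emb.shape)  # torch.Size([16])

The attention-based pooling here merely stands in for whatever mechanism separates the shared factor from local variation; the key point is that the output is a single embedding intended to capture what is common to the whole graph rather than to any individual node.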

We then move to an application of deep graph learning: Situation Recognition, a vision-and-language structured prediction task. Graph-encoding neural models are designed to address structural requirements such as neighbourhood information propagation. When applied directly to advanced reasoning tasks such as Situation Recognition, without being properly adapted to the application domain, they perform poorly because they lack the capacity for complex multi-modal reasoning. We address this by proposing two novel approaches: a transfer-learning-based iterative mechanism and an inter-dependent latent-query-based reasoning mechanism. Unlike graph neural networks, our query-based information propagation methods do not over-emphasize inter-node similarity within neighbourhoods, which biases predictions towards frequent neighbour co-occurrence patterns while ignoring rare but plausible scenarios.
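As an illustration of the second approach, the following hypothetical PyTorch sketch (not the thesis architecture) shows one way a set of inter-dependent latent queries can attend over shared context features and over one another, rather than propagating information only between similar neighbouring nodes; LatentQueryReasoner, num_roles, and all dimensions are assumed names and values.

import torch
import torch.nn as nn

class LatentQueryReasoner(nn.Module):
    """One round of inter-dependent latent-query reasoning over shared context features."""
    def __init__(self, dim: int, num_roles: int, heads: int = 4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_roles, dim))  # one latent query per semantic role
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, num_tokens, dim) image / verb features
        q = self.queries.unsqueeze(0).expand(context.size(0), -1, -1)
        # 1) Each query gathers evidence from the shared context features.
        q = q + self.cross_attn(q, context, context)[0]
        # 2) Queries condition on one another, making the roles inter-dependent
        #    without relying on inter-node similarity between neighbours.
        q = q + self.self_attn(q, q, q)[0]
        return q  # (batch, num_roles, dim): one embedding per semantic role

# Usage: a batch of 2 images, 10 context tokens, 6 semantic roles.
out = LatentQueryReasoner(dim=32, num_roles=6)(torch.randn(2, 10, 32))
print(out.shape)  # torch.Size([2, 6, 32])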