Similarity-Detection-using-Graph-SAGE-Python

An explanation of graph embedding generation and link prediction.

Primary language: Jupyter Notebook

Dataset :

This custom dataset covers the cricket players who participated in the 2019 ICC Cricket World Cup.
151 players from 10 countries took part in the tournament. The dataset was compiled from the player information published on the News18 website.

Knowledge Graph Structure :

The graph has a central node named WC (for World Cup). This node is linked to 10 nodes, each representing one country, and each country is linked to four nodes indicating the different types of players (Batsman, Bowler, All-rounder, Wicket-Keeper). Each of these four nodes is linked to its respective players.
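The structure described above can be sketched with a plain adjacency map. This is a minimal illustration, not the notebook's code: the `build_graph` helper, the `Country:Role` naming of the role nodes, and the sample squads are all assumptions made for the example, and the sample entries are not taken from the dataset.

```python
from collections import defaultdict

ROLES = ("Batsman", "Bowler", "All-rounder", "Wicket-Keeper")

def build_graph(squads):
    """Return an undirected adjacency map: WC -> country -> role -> player."""
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for country, by_role in squads.items():
        link("WC", country)
        for role in ROLES:
            # Hypothetical naming: one role node per country, e.g. "India:Bowler".
            role_node = f"{country}:{role}"
            link(country, role_node)
            for player in by_role.get(role, []):
                link(role_node, player)
    return adj

# Illustrative sample only; the real dataset has 10 countries and 151 players.
sample_squads = {
    "India": {"Batsman": ["Virat Kohli", "Rohit Sharma"], "Bowler": ["Jasprit Bumrah"]},
    "Australia": {"Batsman": ["Steve Smith"], "Bowler": ["Mitchell Starc"]},
}

graph = build_graph(sample_squads)
print(len(graph), "nodes")
print(sorted(graph["India"]))
```

Keeping a separate set of role nodes for each country means two players share a role node only when they have both the same country and the same role.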

[Figure: graph illustration]

We are going to predict the link between any two players by using Graph Embeddings.

Expected Results :

Consider Player 1 = Virat Kohli.
His link with any other batsman from India should be the strongest.
His links with the other players (all-rounders/bowlers) in the Indian team should be the next strongest.
Players from other countries should be the least strongly linked candidates.
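One way to see why this ordering is expected is to count hops in the graph. The sketch below uses a hand-written toy edge list (an assumption for illustration, not the dataset) and a breadth-first search. Hop distance is only a rough proxy for what the embeddings are expected to capture, and it is not the method used in the notebook.

```python
from collections import deque

# Toy edge list following the WC -> country -> role -> player layout.
EDGES = [
    ("WC", "India"), ("WC", "Australia"),
    ("India", "India:Batsman"), ("India", "India:Bowler"),
    ("Australia", "Australia:Batsman"),
    ("India:Batsman", "Virat Kohli"), ("India:Batsman", "Rohit Sharma"),
    ("India:Bowler", "Jasprit Bumrah"),
    ("Australia:Batsman", "Steve Smith"),
]

adj = {}
for a, b in EDGES:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def hops(adj, source, target):
    """Breadth-first search; returns the number of edges on a shortest path."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target not reachable from source

for other in ("Rohit Sharma", "Jasprit Bumrah", "Steve Smith"):
    print(other, hops(adj, "Virat Kohli", other))
```

In this toy graph a same-country batsman is 2 hops away, a same-country bowler is 4 hops away, and a player from another country is 6 hops away, which matches the expected ranking.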

Methodology :

For generating node embeddings : GraphSAGE
For building the graph : StellarGraph
For similarity comparison : cosine similarity
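The two core ideas in this pipeline can be sketched in plain Python. This is a simplified illustration and does not use StellarGraph: it applies a single GraphSAGE-style layer with a mean aggregator over all neighbours (the paper samples a fixed-size neighbourhood and trains the weights), then compares two embeddings with cosine similarity. The toy graph, the random node features, the random untrained weights, and all function names are assumptions made for the example, so the resulting similarity value carries no meaning.

```python
import math
import random

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def sage_mean_layer(adj, feats, w_self, w_neigh):
    """One GraphSAGE-style layer: ReLU(W_self.h_v + W_neigh.mean(h_u)), L2-normalised."""
    out = {}
    for node, h_self in feats.items():
        neigh = [feats[n] for n in adj[node]]
        # Element-wise mean of the neighbour feature vectors.
        h_mean = [sum(col) / len(neigh) for col in zip(*neigh)]
        # Equivalent to one weight matrix applied to concat(h_self, h_mean).
        z = [max(0.0, a + b)
             for a, b in zip(matvec(w_self, h_self), matvec(w_neigh, h_mean))]
        norm = math.sqrt(sum(x * x for x in z))
        out[node] = [x / norm for x in z] if norm > 0 else z
    return out

def cosine_similarity(u, v):
    """cos(u, v) = u.v / (|u| |v|); returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy graph fragment (assumption): two role nodes and three players.
adj = {
    "India:Batsman": ["Virat Kohli", "Rohit Sharma"],
    "India:Bowler": ["Jasprit Bumrah"],
    "Virat Kohli": ["India:Batsman"],
    "Rohit Sharma": ["India:Batsman"],
    "Jasprit Bumrah": ["India:Bowler"],
}

rng = random.Random(0)
IN_DIM, OUT_DIM = 4, 3
# Random placeholder features and untrained weights, for illustration only.
feats = {n: [rng.gauss(0, 1) for _ in range(IN_DIM)] for n in adj}
w_self = [[rng.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]
w_neigh = [[rng.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]

emb = sage_mean_layer(adj, feats, w_self, w_neigh)
sim = cosine_similarity(emb["Virat Kohli"], emb["Rohit Sharma"])
print("cosine similarity:", sim)
```

With trained weights, ranking all other players by `cosine_similarity` against one player's embedding would give the candidate ordering described under Expected Results.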

References :

Inductive Representation Learning on Large Graphs, Hamilton et al., NeurIPS 2017.