A knowledge graph model that can genaralize to unseen relationships without any training. Makes use of RMPI graph network along with BERT to learn embeddings for relationships and attempts to generalize from just training on the KG once
Download the required compressed datasets into the data
folder:
Download link | Size (compressed) |
---|---|
UMLS (small graph for tests) | 121 KB |
WN18RR | 6.6 MB |
FB15k-237 | 21 MB |
Wikidata5M | 1.4 GB |
GloVe embeddings | 423 MB |
DBpedia-Entity | 1.3 GB |
Then use tar
to extract the files, e.g.
tar -xzvf WN18RR.tar.gz
Overlap Coefficient -
Jaccard Coeffecient -
CompNet - https://web.rniapps.net/compnet/ to visualize the graph
Similarity with Textual Embeddings
Cosine-Similarity -
L2 Distance -
- Sentence Embedding -> Kmeans (k=3)
Number of Triples in Train = 82436
Number of Unique Entities(HEAD, TAIL) in Train = 7644
Number of Unique Relationship in Train = 55
Number of Triples in Test = 200600
Number of Unique Entities(HEAD, TAIL) in Test = 12969
Number of Unique Relationship in Train = 132
Number of Triples in Validation = 27080
Number of Unique Entities(HEAD, TAIL) in Validation = 5961
Number of Unique Relationship in Train = 50
- TFIDVectorizer -> Kmeans(K=3)
Number of Triples in Train = 67039
Number of Unique Entities(HEAD, TAIL) in Train = 5873
Number of Unique Relationship in Train = 38
Number of Triples in Test = 42614
Number of Unique Entities(HEAD, TAIL) in Test = 6489
Number of Unique Relationship in Train = 20
Number of Triples in Validation = 200463
Number of Unique Entities(HEAD, TAIL) in Validation = 13970
Number of Unique Relationship in Train = 179
- TFIDVectorizer -> Kmeans(K=3)
Number of Triples in Train = 24722
Number of Unique Entities(HEAD, TAIL) in Train = 6600
Number of Unique Relationship in Train = 132
Number of Triples in Test = 3
Number of Unique Entities(HEAD, TAIL) in Test = 5
Number of Unique Relationship in Train = 1
Number of Triples in Validation = 18
Number of Unique Entities(HEAD, TAIL) in Validation = 13
Number of Unique Relationship in Train = 1
- Sentence Embedding -> Kmeans(K=3)
Number of Triples in Train = 13032
Number of Unique Entities(HEAD, TAIL) in Train = 4962
Number of Unique Relationship in Train = 79
Number of Triples in Test = 8515
Number of Unique Entities(HEAD, TAIL) in Test = 1336
Number of Unique Relationship in Train = 25
Number of Triples in Validation = 3196
Number of Unique Entities(HEAD, TAIL) in Validation = 1196
Number of Unique Relationship in Train = 30