A Python library for knowledge graph representation learning (graph embedding).

Emgraph

Emgraph (Embedding graphs) is a Python library for graph representation learning.

It provides a simple API for designing, training, and evaluating graph embedding models. You can also build your own models on top of the provided base models.

Installation

Install the latest version of Emgraph:

$ pip install emgraph

Quick start

Embed the WordNet11 (WN11) graph with the TransE model:

from sklearn.metrics import brier_score_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import TransE


def train_transe(data):
    # Train TransE on the training triples, then score the test triples.
    model = TransE(batches_count=64, seed=0, epochs=20, k=100, eta=20,
                   optimizer='adam', optimizer_params={'lr': 0.0001},
                   loss='pairwise', verbose=True, large_graphs=False)
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)

    scores = train_transe(data=wn11_dataset)
    print("Scores: ", scores)
    # expit maps the raw scores to (0, 1) before the Brier score is computed.
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
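
The last line of the example turns raw model scores into probabilities with the logistic sigmoid (`expit`) and compares them with the binary test labels using the Brier score. The following is a minimal, dependency-free sketch of those two computations. The helper names `sigmoid` and `brier_score` and the sample values are made up for illustration and are not part of Emgraph.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, the function that scipy.special.expit computes."""
    # Branch on the sign so that math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def brier_score(labels, probabilities):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, labels)) / len(labels)


# Made-up raw scores and labels, for illustration only.
raw_scores = [2.0, -1.0, 0.0, 3.0]
labels = [1, 0, 1, 0]

probabilities = [sigmoid(s) for s in raw_scores]
print("Probabilities:", [round(p, 3) for p in probabilities])
print("Brier score:", round(brier_score(labels, probabilities), 4))
```

A Brier score of 0 means every probability matches its label exactly, and 1 is the worst possible value, so lower is better.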

Evaluate a ComplEx model after training:

import numpy as np
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import ComplEx
from emgraph.evaluation import evaluate_performance


def complex_performance(data):
    model = ComplEx(batches_count=10, seed=0, epochs=20, k=150, eta=1,
                    loss='nll', optimizer='adam')
    # Train on the union of the training and validation triples.
    model.fit(np.concatenate((data['train'], data['valid'])))
    # All known triples, passed as filter_triples for the evaluation.
    filter_triples = np.concatenate((data['train'], data['valid'], data['test']))
    # Only the first five test triples are ranked, to keep the example fast.
    ranks = evaluate_performance(data['test'][:5], model=model,
                                 filter_triples=filter_triples,
                                 corrupt_side='s+o',
                                 use_default_protocol=False)
    return ranks


if __name__ == '__main__':
    wn18_dataset = BaseDataset.load_dataset(DatasetType.WN18)
    ranks = complex_performance(data=wn18_dataset)
    print("ranks {}".format(ranks))
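
Ranks such as those returned above are usually summarized with mean rank, mean reciprocal rank (MRR), and Hits@N. The following is a minimal sketch of these summaries in plain Python. The function names and the sample ranks are hypothetical and used only for illustration. Emgraph may provide its own metric helpers, which this sketch does not assume.

```python
def mean_rank(ranks):
    """Average rank of the true triples (lower is better)."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (higher is better)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_n(ranks, n):
    """Fraction of test triples whose true answer is ranked in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Made-up ranks (1 is the best possible rank), for illustration only.
example_ranks = [1, 3, 2, 10, 50]

print("Mean rank:", mean_rank(example_ranks))
print("MRR:", round(mean_reciprocal_rank(example_ranks), 4))
print("Hits@3:", hits_at_n(example_ranks, 3))
print("Hits@10:", hits_at_n(example_ranks, 10))
```

MRR is less sensitive to a few very poor ranks than mean rank, which is why the two are often reported together.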

More examples

Embed the WordNet11 (WN11) graph with the DistMult model:

from sklearn.metrics import brier_score_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import DistMult


def train_dist_mult(data):
    # Train DistMult with a pairwise loss (margin 5), then score the test triples.
    model = DistMult(batches_count=1, seed=555, epochs=20, k=10, loss='pairwise',
                     loss_params={'margin': 5})
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)

    scores = train_dist_mult(data=wn11_dataset)
    print("Scores: ", scores)
    # expit maps the raw scores to (0, 1) before the Brier score is computed.
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))

Algorithms table

#  Model     Reference
1  TransE    Translating Embeddings for Modeling Multi-relational Data
2  ComplEx   Complex Embeddings for Simple Link Prediction
3  HolE      Holographic Embeddings of Knowledge Graphs
4  DistMult  Embedding Entities and Relations for Learning and Inference in Knowledge Bases
5  ConvE     Convolutional 2D Knowledge Graph Embeddings
6  ConvKB    A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network
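
To give a feel for how three of these models differ, here is a small sketch of the scoring functions as they are usually stated in the referenced papers, applied to a single (subject, predicate, object) triple. It is written in plain Python for readability and is an illustration only, not Emgraph's internal implementation. The function names and the toy embeddings are hypothetical.

```python
import math


def transe_score(s, p, o, norm=1):
    """TransE: negative distance between (subject + predicate) and object."""
    diffs = [abs(a + b - c) for a, b, c in zip(s, p, o)]
    if norm == 1:
        return -sum(diffs)
    return -math.sqrt(sum(d * d for d in diffs))


def distmult_score(s, p, o):
    """DistMult: trilinear product of the three real-valued embeddings."""
    return sum(a * b * c for a, b, c in zip(s, p, o))


def complex_score(s, p, o):
    """ComplEx: real part of the trilinear product with the object conjugated."""
    return sum(a * b * c.conjugate() for a, b, c in zip(s, p, o)).real


# Toy embeddings, for illustration only.
print("TransE:", transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
print("DistMult:", distmult_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]))
print("ComplEx (s, p, o):", complex_score([1 + 1j], [1j], [1 - 1j]))
print("ComplEx (o, p, s):", complex_score([1 - 1j], [1j], [1 + 1j]))
```

Swapping subject and object leaves the DistMult score unchanged but can change the ComplEx score, which is why ComplEx can represent asymmetric relations while DistMult treats every relation as symmetric.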

Call for Contributions

The Emgraph project welcomes your expertise and enthusiasm!

Ways to contribute to Emgraph:

  • Write code
  • Review pull requests
  • Develop tutorials, presentations, and other educational materials
  • Translate the documentation and README contents

Issues

If you encounter an issue in the code, please report it in the project's issue tracker. Better still, fork the repository on GitHub and open a pull request with a fix.

Features

  • Support CPU/GPU
  • Vectorized operations
  • Preprocessors
  • Dataset loader
  • Standard API
  • Documentation
  • Test-driven development

If you find this project helpful, please consider giving it a star.

License

Released under the BSD license

Credit

This repository is an adaptation of the AmpliGraph library for TensorFlow 2, reimplemented with a modular architecture. It also draws inspiration from PyKEEN and Spectral. Credit is extended to these exceptional projects.

Contact

Copyright © 2019-2024 Emgraph Developers
Soran Ghaderi (soran.gdr.cs@gmail.com)
Taleb Zarhesh (taleb.zarhesh@gmail.com)