Knowledge Graph Question-Answering

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Question Answering over Multiple and Heterogeneous Knowledge Bases


The MuHeQA (Multiple and Heterogeneous Question-Answering) system generates natural language answers to natural language questions by drawing on knowledge from both structured (knowledge graph) and unstructured (document) data sources.



  1. Prepare a Python 3 environment with Conda installed.
  2. Clone this repo:
      git clone https://github.com/librairy/MuHeQA.git
  3. Move into the root directory:
      cd MuHeQA
  4. Download the RDF Verbalizer model into the application/summary/kg/nlg/model folder.
  5. Download the answer classifier and unzip it into the project root directory. This creates the resources_dir/ folder.
  6. Create a new Conda environment from the environment.yml file:
      conda env create -f environment.yml
  7. Activate the environment:
      conda activate .muheqa
  8. If your device is based on Apple's M1 chip, skip the next step and follow the M1 Environments section instead.
  9. Install the dependencies:
      pip install -r requirements.txt

M1 Environments (only for Apple's M1 devices)

  1. Install the Apple edition of TensorFlow:
    pip install --upgrade --force --no-dependencies tensorflow-macos
    pip install --upgrade --force --no-dependencies tensorflow-metal
  2. Install the following libraries:
    pip install Flask==1.1.4
    pip install Flask-Cors==3.0.10
    pip install Flask-Script==2.0.6
    pip install spacy==3.0.6
    pip install spacy-dbpedia-spotlight==0.2.1
    pip install spacy-entity-linker==1.0.1
    pip install spacy-legacy==3.0.8
  3. Compile and install the tokenizers module from Hugging Face (the first command installs the Rust toolchain it needs):
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    cd /Users/cbadenes/Projects   # the author's workspace; use any working directory of your own
    git clone https://github.com/huggingface/tokenizers
    cd tokenizers/bindings/python
    pip install setuptools_rust
    python setup.py install
  4. Compile and install the transformers module from Hugging Face:
    pip install git+https://github.com/huggingface/transformers
  5. Finally, install torch and keras:
    pip install flatbuffers
    pip install keras==2.6.0
    pip install torch
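
Because the M1 steps install several pinned versions by hand, a quick check of what actually ended up in the environment can save debugging time later. The sketch below is not part of MuHeQA; it only uses the standard library, and the helper names (installed_version, find_mismatches) are made up for illustration. The pins are copied from the steps above.

```python
from importlib import metadata

# Version pins copied from the M1 instructions above.
PINNED = {
    "Flask": "1.1.4",
    "Flask-Cors": "3.0.10",
    "Flask-Script": "2.0.6",
    "spacy": "3.0.6",
    "spacy-dbpedia-spotlight": "0.2.1",
    "spacy-entity-linker": "1.0.1",
    "spacy-legacy": "3.0.8",
    "keras": "2.6.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to an (expected, found) tuple; found is None when it is missing."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run inside the activated Conda environment, the script prints nothing when every pin is satisfied.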

Service start-up

  1. Once the environment is ready, run the following command (use runserver for development mode or runprodserver for production mode):

    python manage.py runserver
  2. Loading the external resources may take a few minutes. The following log lines appear once everything is ready:

    Loading RDF2nlg model: /Users/cbadenes/Projects/muheqa/application/summary/kg/nlg/model ..
    model ready
    Linked to DBpedia(en): http://dbpedia.org/sparql
    Linked to Wikidata (en): http://query.wikidata.org/sparql
    Ready to answer question from the English edition of CORD-19 collection
    Loading bert-large-uncased-whole-word-masking-finetuned-squad model..
    model ready
    Loading deepset/roberta-base-squad2-covid model..
    model ready
    Loading deepset/roberta-base-squad2 model..
    model ready
    English answerer is ready
     * Serving Flask app "application.app" (lazy loading)
     * Environment: production
       WARNING: This is a development server. Do not use it in a production deployment.
       Use a production WSGI server instead.
     * Debug mode: off
     * Running on (Press CTRL+C to quit)
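
Since start-up can take minutes, a script that launches the service may want to block until the logs report readiness. The following is a minimal sketch, not part of MuHeQA: the function names are hypothetical, and treating the "Running on" line as the readiness marker is an assumption based on the log excerpt above.

```python
import subprocess


def wait_until_ready(lines, marker="Running on"):
    """Consume log lines until one contains `marker`.

    Returns True as soon as the marker is seen, or False if the stream
    ends first (for example, because the server exited during start-up).
    """
    for line in lines:
        if marker in line:
            return True
    return False


def start_server(command=("python", "manage.py", "runserver")):
    """Launch the server and block until its logs report readiness."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr, where Flask may log
        text=True,
    )
    if not wait_until_ready(process.stdout):
        process.wait()
        raise RuntimeError(f"server exited with code {process.returncode}")
    # The caller should keep reading process.stdout afterwards so the
    # pipe buffer does not fill up and stall the server.
    return process
```

wait_until_ready accepts any iterable of strings, so it can also be pointed at a log file opened for reading.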

Server routes

Each request must include a question field in the message body containing the natural language question. The evidence query parameter controls whether the generated summary is returned along with the answer.

The available URIs are:

  • /muheqa/dbpedia: answers questions using the English edition of DBpedia.
  • /muheqa/wikidata: answers questions using the English edition of Wikidata.
  • /muheqa/cord19: answers questions using the COVID-19 Open Research Dataset (CORD-19).
  • /muheqa/all: answers questions using all sources of information.
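
As a rough illustration of how a client might assemble these routes, the sketch below builds the request URL for a given source. It rests on assumptions: the host and port are not stated here, so http://localhost:5000 is only a placeholder (use the address printed in the start-up logs), the evidence parameter is assumed to take true or false, and route_url is a hypothetical helper rather than part of MuHeQA.

```python
from urllib.parse import urlencode

# Route suffixes taken from the list above.
SOURCES = ("dbpedia", "wikidata", "cord19", "all")


def route_url(base_url, source, evidence=None):
    """Build the URL for one MuHeQA route.

    `evidence` is appended as a query parameter only when given; its
    true/false encoding is an assumption, not confirmed by the README.
    """
    if source not in SOURCES:
        raise ValueError(f"unknown source {source!r}; expected one of {SOURCES}")
    url = f"{base_url.rstrip('/')}/muheqa/{source}"
    if evidence is not None:
        url += "?" + urlencode({"evidence": str(bool(evidence)).lower()})
    return url


if __name__ == "__main__":
    # Placeholder address; replace it with the one shown in the server logs.
    print(route_url("http://localhost:5000", "dbpedia", evidence=True))
```

The question itself still travels in the message body, as described above, so it is deliberately left out of the URL.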


To answer the question "Where was Fernando Alonso born?" using DBpedia:

 curl --location --request GET '' --form 'question="Where was Fernando Alonso born?"'

And the response:

 {
   "answer": "Oviedo, Asturias, Spain",
   "confidence": 0.801,
   "evidence": {
     "end": 149,
     "summary": "  The car number of Fernando Alonso is 14.   The Last win of Fernando Alonso is 2013.   The birth place of Fernando Alonso is Oviedo, Asturias, Spain.   The name of Fernando Alonso is Fernando Alonso.   The First win of Fernando Alonso is 2003.   The last season of Fernando Alonso is 2018.   The birth name of Fernando Alonso is Fernando Alonso D\u00edaz.   The caption of Fernando Alonso is Alonso in 2016.   The First race of Fernando Alonso is 2001.   The image size of Fernando Alonso is 240.   The last win of Fernando Alonso is 2013 Spanish Grand Prix.   The nationality of Fernando Alonso is Spanish.   The title of Fernando Alonso is Fernando Alonso achievements, Fernando Alonso teams and series.   The first race of Fernando Alonso is 2001 Australian Grand Prix.   The 2021 Team of Fernando Alonso is Alpine F1, Renault in Formula One.   The source  of Fernando Alonso is Alonso's race engineer at Ferrari, Andrea Stella, on Alonso's ability and similarities to Michael Schumacher.   The first win of Fernando Alonso is 2003 Hungarian Grand Prix.  .  ",
     "start": 126
   },
   "question": "where was Fernando Alonso born?"
 }