Question Answering NLP Chatbot for Text Document Corpora, built using spaCy, huggingface’s BERT language model, ElasticSearch, Telegram Bot API, and hosted on Heroku.

DrFAQ

  • DrFAQ is a Question Answering NLP Chatbot for Text Document Corpora.
  • It implements an NLP Question Answering architecture built with spaCy, huggingface’s BERT language model, ElasticSearch, and the Telegram Bot API, and is hosted on Heroku.

Objective

  • Given an organisation's corpus of documents, generate a chatbot to enable natural question-answering capabilities.

  • Due to Heroku's free tier limits, only two functions are enabled: FAQ Question Matching using spaCy's Similarity, and Answer Search using ElasticSearch.
  • The demo is implemented with information from the National University of Singapore's University Scholars Programme website.

Methodology

When a question is asked, the following processes are performed:

  1. FAQ Question Matching using spaCy's Similarity - /match
    • From a given list of Frequently Asked Questions (FAQs), the chatbot finds the question most similar to the one asked and returns its answer from the existing list.
  2. NLP Question Answering using huggingface's BERT - /nlp
    • If the question asked is not similar to any existing FAQ, perform question answering on the knowledge base and return the answer if it is sufficiently confident.
  3. Answer Search using ElasticSearch - /search
    • If the model is not sufficiently confident in its answer, perform a search on the document corpus and return the search results.
  4. Human Intervention
    • If the search results are still not relevant, prompt a human to add the question-answer pair to the existing list of FAQs, or direct the user to speak to a human.
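
The four steps above form a fallback cascade, which can be sketched as follows. This is a minimal illustration, not DrFAQ's actual code: the function names, the `Reply` type, and both threshold values are assumptions, and the three backends are passed in as plain callables so that spaCy, BERT, and ElasticSearch can each be swapped in behind them. As a simplification, an empty search result stands in for "search results are not relevant", which in practice would likely be judged by the user.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical thresholds; the values DrFAQ uses are not stated in this README.
MATCH_THRESHOLD = 0.8   # minimum FAQ similarity to accept a match
NLP_THRESHOLD = 0.5     # minimum model confidence to accept an answer


@dataclass
class Reply:
    stage: str  # "match", "nlp", "search", or "human"
    text: str


def answer(
    question: str,
    match_faq: Callable[[str], Tuple[str, float]],   # -> (answer, similarity)
    nlp_answer: Callable[[str], Tuple[str, float]],  # -> (answer, confidence)
    search: Callable[[str], List[str]],              # -> list of result snippets
) -> Reply:
    """Try each stage in order and return the first acceptable reply."""
    # 1. FAQ question matching
    faq_text, similarity = match_faq(question)
    if similarity >= MATCH_THRESHOLD:
        return Reply("match", faq_text)

    # 2. NLP question answering on the knowledge base
    nlp_text, confidence = nlp_answer(question)
    if confidence >= NLP_THRESHOLD:
        return Reply("nlp", nlp_text)

    # 3. Answer search over the document corpus
    hits = search(question)
    if hits:
        return Reply("search", "\n".join(hits))

    # 4. Human intervention
    return Reply("human", "No answer found. Please speak to a human.")


# Stub backends with made-up data, standing in for the real services.
FAQS = {"how do i apply?": "Fill in the online form."}


def stub_match(question: str) -> Tuple[str, float]:
    key = question.strip().lower()
    return (FAQS[key], 1.0) if key in FAQS else ("", 0.0)


def stub_nlp(question: str) -> Tuple[str, float]:
    return ("", 0.0)  # always unsure


def stub_search(question: str) -> List[str]:
    return []  # never finds anything


if __name__ == "__main__":
    print(answer("How do I apply?", stub_match, stub_nlp, stub_search))
    print(answer("Where is the library?", stub_match, stub_nlp, stub_search))
```

Keeping the backends behind callables means that a deployment with limited resources can replace the BERT stage with a stub like `stub_nlp`, so the cascade falls straight through from FAQ matching to search.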

References