- DrFAQ is a Question Answering NLP Chatbot for Text Document Corpora.
- Designed and implemented an NLP question-answering architecture using spaCy, huggingface's BERT language model, Elasticsearch, and the Telegram Bot API, hosted on Heroku.
- Given an organisation's corpus of documents, DrFAQ generates a chatbot that answers natural-language questions about that corpus.
Demo - t.me/DrFAQ_Bot
- Due to Heroku's free-tier limits, only two functions are enabled in the demo: FAQ Question Matching using spaCy's Similarity and Answer Search using Elasticsearch.
- The demo uses information from the National University of Singapore's University Scholars Programme website.
When a question is asked, the following processes are performed:
- FAQ Question Matching using spaCy's Similarity - /match
- Given a list of Frequently Asked Questions (FAQs), the chatbot scores each FAQ's similarity to the question asked and returns the answer of the best match.
- NLP Question Answering using huggingface's BERT - /nlp
- If the question asked is not similar to any existing FAQ, perform question answering on the knowledge base and return the answer if the model is sufficiently confident in it.
- Answer Search using Elasticsearch - /search
- If the model is not sufficiently confident in its answer, search the document corpus and return the search results.
- Human Intervention
- If the search results are still not relevant, prompt a human to add the question-answer pair to the existing list of FAQs, or let the user speak to a human.
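
The fallback order described above can be sketched as a small cascade. This is only an illustrative sketch, not DrFAQ's actual code: the function names, the two threshold values, and the word-overlap `jaccard` similarity (a dependency-free stand-in for spaCy's similarity) are all assumptions, and "not relevant" search results are approximated here as an empty result list.

```python
from typing import Callable, List, Tuple

# Hypothetical thresholds; the real values used by DrFAQ are not stated here.
MATCH_THRESHOLD = 0.8
ANSWER_THRESHOLD = 0.5


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, a simple stand-in for spaCy's similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def answer_question(
    question: str,
    faqs: List[Tuple[str, str]],                   # (faq_question, faq_answer) pairs
    similarity: Callable[[str, str], float],       # e.g. spaCy similarity
    qa_model: Callable[[str], Tuple[str, float]],  # returns (answer, confidence)
    search: Callable[[str], List[str]],            # e.g. an Elasticsearch query
) -> Tuple[str, object]:
    """Return (stage, result) from the first stage that produces a result."""
    # Stage 1 (/match): reuse the answer of the most similar FAQ, if close enough.
    if faqs:
        scored = [(similarity(question, q), a) for q, a in faqs]
        best_score, best_answer = max(scored, key=lambda pair: pair[0])
        if best_score >= MATCH_THRESHOLD:
            return "match", best_answer

    # Stage 2 (/nlp): question answering over the knowledge base.
    answer, confidence = qa_model(question)
    if confidence >= ANSWER_THRESHOLD:
        return "nlp", answer

    # Stage 3 (/search): fall back to searching the document corpus.
    hits = search(question)
    if hits:
        return "search", hits

    # Stage 4: nothing usable was found, so hand over to a human.
    return "human", "Please add this question to the FAQs or speak to a human."
```

Passing the three stages in as callables keeps the control flow independent of the libraries behind them, so a stage can be disabled (as in the Heroku demo) by supplying a stub that returns zero confidence or no hits.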
- explosion/spaCy - Industrial-strength Natural Language Processing (NLP) with Python and Cython
- huggingface/transformers - Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
- elastic/elasticsearch-py - Official Python low-level client for Elasticsearch
- python-telegram-bot/python-telegram-bot - Python Wrapper for Telegram Bots
- google-research/bert - TensorFlow code and pre-trained models for BERT
- BERT - Pre-training of Deep Bidirectional Transformers for Language Understanding