honours-project-backend: A Python repository from migbash-university

📜 About / Description

This is an Edinburgh Napier University Honours Project for the BSc Computer Science degree.

This Honours Project revolves around conversational A.I. agents used in education to pass knowledge on to humans. Because there is little existing real-world literature and research on multi-modal learning approaches, such as visual, audio, and interactive-chart methods of learning, the project aims to understand the impact of using a Conversational A.I. Agent to teach a user about a scientific topic, and to compare it with standard, more common methods of learning.

This repository is only one half of the project. It focuses on the development of the conversational A.I. agent and an outline of that development. The UI/UX, in web-app format, can be accessed at the following link -> [insert-link-here], and its open-source code can be found here -> [insert-link-here]

This project is a conversational agent tailored to a space-education website.

📊 Project Graph Overview [ChatBot]

📃 Data

The data used for the project comes from one main source -> NASA, using its published links and data.

Due to the lack of openly available APIs for consumers, the data on Titan was gathered manually and compiled into a three-page text, accessible at the following link: link. This data was used throughout the testing phase with the participants.

🐳 Dockerized

This application has been developed as a Docker application, so it can be deployed quickly wherever needed.

🚀 Get Started

To get started with the development of the project, follow the steps below:

🛠 Development

First, clone the repository:

git clone https://github.com/migbash-university/honours-project-backend

Update the requirements.txt file by following this guide:

https://drgabrielharris.medium.com/python-how-create-requirements-txt-using-pipenv-2c22bbb533af

🚀 Production and Deployment

This application is deployed with Google Cloud. To deploy it, run the following command:

gcloud run deploy

This will deploy to your Google Cloud account.

📚 Project Dependencies

This project has been built and developed using the following libraries and modules,

📂 Project Structure

This project is organised around the following structure and key files:

.python-version - contains the Python version selected through pyenv-win for this project.
app.py - contains the main entry point of the project and the endpoints the application needs to run.
Pipfile + Pipfile.lock - the files pipenv uses to manage the project's dependencies and its virtualenv.
bert_env - contains the data for the BERT development and the data needed for its operation.

🤔 Why certain technologies?

A closer look at the project's technology stack and design decisions:

📌 Implementation #1 [w/ BERT]

This implementation uses BERT through HuggingFace's transformers library. The model is a pre-trained model obtained from HuggingFace's list of existing models, here.

The project has also been implemented as a Google Colab notebook, here.

https://huggingface.co/deepset/roberta-base-squad2?context=My+name+is+Wolfgang+and+I+live+in+Berlin&question=Where+do+I+live%3F
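As a rough illustration of how a pre-trained question-answering model can be called through the transformers library, here is a minimal sketch. It assumes the `deepset/roberta-base-squad2` checkpoint from the link above; the function name `answer_question` and the optional `qa` parameter are hypothetical and are not taken from this repository's code.

```python
def answer_question(question, context, qa=None):
    """Return the answer text a QA model extracts from `context`.

    `qa` is any callable with the same interface as a HuggingFace
    question-answering pipeline. When it is omitted, a pipeline is
    built on demand (this downloads the model on first use).
    """
    if qa is None:
        # Imported lazily so the function can be exercised with a stub.
        from transformers import pipeline

        # Assumption: the checkpoint named in the link above.
        qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

    # The pipeline returns a dict with 'answer', 'score', 'start' and 'end'.
    result = qa(question=question, context=context)
    return result["answer"]


if __name__ == "__main__":
    # Same example as in the model link above.
    print(answer_question("Where do I live?",
                          "My name is Wolfgang and I live in Berlin"))
```

Passing the model in as a parameter keeps the model-loading cost in one place and lets the function be tested with a stub instead of the real model.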

Why PyTorch vs. TensorFlow?

The main reason for using PyTorch instead of TensorFlow is how each is typically used: "PyTorch has long been the preferred deep-learning library for researchers, while TensorFlow is much more widely used in production. PyTorch's ease of use makes it convenient for fast, hacky solutions and smaller-scale models." - ref

What is Question Answering (QA)?

Question answering is the ability of a conversational agent to answer a particular question a user may have, based on a context (or a passage). This is explored and explained further in the following article 1
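To make the idea of answering "based on a context" more concrete, the sketch below shows one common way extractive QA models pick an answer: the model scores every token as a possible start and end of the answer, and the highest-scoring valid span is returned. This is a simplified illustration, not this project's code; the scores are made up by hand and the `best_span` helper is hypothetical.

```python
def best_span(start_scores, end_scores, max_len=15):
    """Return (start, end) token indices of the highest-scoring span.

    A span is valid only if it ends at or after its start and covers
    at most `max_len` tokens.
    """
    best = (0, 0)
    best_score = float("-inf")
    for i, start in enumerate(start_scores):
        last = min(i + max_len, len(end_scores))
        for j in range(i, last):
            score = start + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Context, split into tokens. Question: "Which planet does Titan orbit?"
tokens = ["Titan", "is", "the", "largest", "moon", "of", "Saturn", "."]

# Hand-written scores standing in for a real model's output.
start_scores = [0.1, 0.0, 0.2, 0.3, 0.1, 0.0, 2.5, 0.0]
end_scores = [0.0, 0.0, 0.0, 0.1, 0.4, 0.0, 3.0, 0.1]

i, j = best_span(start_scores, end_scores)
answer = " ".join(tokens[i:j + 1])
print(answer)
```

Real models work on sub-word tokens and produce these scores with a neural network, but the final span-selection step follows the same idea.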

Why not use RASA?

RASA was left out of the development cycle of this project in favour of a simpler approach that relies solely on BERT as the question-answering model. However, examples of a RASA -> BERT integration exist and can be found here.

Why pipenv and pyenv-win?

Pipenv: To learn more about how to use the pipenv Python module for better management and versioning of Python libraries, please consult the following guide here.
Pyenv: Learn more about how to manage different Python versions in a single environment using pyenv. tutorial tutorial-2. Useful pyenv commands -> commands

Why a Flask RESTful API Service?

A Flask API is used so that the Conversational A.I. agent can be reached over the web through well-defined HTTP methods. To validate the Flask application and its methods, the pytest module is used -> (https://flask.palletsprojects.com/en/2.0.x/testing/)
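As an illustration of how such a service could be laid out, here is a minimal sketch. The `/ask` route, the JSON field names, and the `handle_ask` and `create_app` helpers are all assumptions made for this example and are not taken from this repository's app.py.

```python
def handle_ask(payload, qa):
    """Validate a JSON payload and return a (body, status_code) pair.

    `qa` is any callable taking `question` and `context` keyword
    arguments and returning a dict with an 'answer' key.
    """
    if not isinstance(payload, dict):
        payload = {}
    question = str(payload.get("question") or "").strip()
    context = str(payload.get("context") or "").strip()
    if not question or not context:
        return {"error": "both 'question' and 'context' are required"}, 400

    result = qa(question=question, context=context)
    return {"answer": result["answer"]}, 200


def create_app(qa):
    """Build a Flask app exposing the QA callable on a hypothetical /ask route."""
    # Imported lazily so handle_ask can be tested without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/ask", methods=["POST"])
    def ask():
        body, status = handle_ask(request.get_json(silent=True), qa)
        return jsonify(body), status

    return app
```

Keeping the request handling in a plain function means most of the logic can be checked with pytest directly, while the route itself can be exercised through Flask's `app.test_client()` as described in the testing guide linked above.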