mts-ai-nlp

NLP project for MTS AI

Description

This system, designed for efficient airline ticket booking, consists of three main components: the Q4 quantized Mistral-7B-Instruct-v0.1 language model, the ChromaDB database system, and a user name extraction module powered by bert-large-NER. The language model processes and responds to user requests, while the user name extraction module, utilizing a fine-tuned BERT model, accurately identifies user names from inputs. The ChromaDB system stores and retrieves user ticket data, initially held in a pandas dataframe with flights information for efficient manipulation. These components work together to automate ticket booking, providing a personalized user experience.

Scripts:

  1. bert_ner.py - A fine-tuned BERT model for entity recognition
  2. chat.py - Runs the chat
  3. embedder.py - An embedding sup-simcse-roberta-large model to operate with text in vector db
  4. evaluator.py - Evaluates the model answer correctness
  5. flights_db_filler.py - Fills the database with synthetic data
  6. flights_db.py - A class to operate on the pandas flights dataframe
  7. llm.py - A class to interact with the language model
  8. tickets_db.py - A class to operate on the ChromaDB database
  9. utils.py - Utility functions for features extraction from text

Video Demo:

Project video demonstration

Notes:

This system uses the Q4 version of LLM through the LLAMA_cpp_python binding. Other Language Models work through the transformers library.

Usage: