
This repository contains experiments with LLMs


LLM simple QnA example


This repository contains an example of a simple QnA system based on LLMs and LangChain. Qdrant is used as the vector search engine.

Setup python environment

Note

This project is based on Python 3.10 and uses poetry to manage dependencies.

  1. Clone the repository using the git clone command.
  2. Open a terminal and go to the project directory using the cd command.
  3. Create a virtual environment using the python -m venv venv or conda create -n venv python=3.10 command.
  4. Activate the virtual environment using the source venv/bin/activate or conda activate venv command.
  5. Install poetry using the instructions from here. Use the "With the official installer" section.
  6. Set the following option to disable new virtualenv creation:
    poetry config virtualenvs.create false
  7. Install the dependencies using the poetry install --no-root command. The --no-root flag is needed to avoid installing the package itself.
  8. Set up the pre-commit hooks using the pre-commit install command. You can find more information about pre-commit here.
  9. Run the tests to check that the project works correctly, using the following command:
    python -m unittest -b
  10. Once the tests pass, you can work with the project!
  11. If you want to add new dependencies, use the poetry add <package_name> command. You can find more information about poetry here.
  12. If you want to add new tests, use the unittest library. You can find more information about unittest here. All tests should be placed in the tests directory.
  13. All commits should be checked by the pre-commit hooks. If you want to skip this check, use the git commit --no-verify command, but this is not recommended.
  14. You can also run the pre-commit hooks manually using the pre-commit run --all-files command.
  15. You can find more useful commands in the Makefile.
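
As an illustration of step 12, here is a minimal sketch of what a test module in the tests directory could look like. The helper normalize_question and the file name tests/test_normalize.py are hypothetical and are not part of this repository; they only show the shape of a unittest test case.

```python
# tests/test_normalize.py (hypothetical file name, for illustration only)
import unittest


def normalize_question(text: str) -> str:
    """Hypothetical helper: trim the question and collapse runs of whitespace."""
    return " ".join(text.split())


class TestNormalizeQuestion(unittest.TestCase):
    def test_collapses_whitespace(self):
        # Leading/trailing spaces are dropped, inner whitespace becomes one space.
        self.assertEqual(
            normalize_question("  What   is\tQdrant? "), "What is Qdrant?"
        )

    def test_empty_input(self):
        # A whitespace-only question normalizes to an empty string.
        self.assertEqual(normalize_question("   "), "")
```

Assuming the tests directory is set up for discovery, a module like this should be picked up by the python -m unittest -b command from step 9.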

Examples

Warning

You need wget to download the files and Docker to run the Qdrant and Redis services.

How to start?

Important

The knowledge base for RAG consists of two parts: PDF files and data from Wikipedia. The first part is downloaded manually with a script, and the second part is downloaded from the code.

  1. Run the make download_dataset command to download the PDF files. The files will be placed in the data directory; they are only one part of the full dataset that will be used.
  2. Run the make run_qdrant command to start the Qdrant service. It will be available at http://localhost:6333.
  3. Run the make run_redis command to start the Redis service. It is needed to cache the LLM prompts and requests in step 5.
  4. Run the notebook notebooks/01-open-source-llms-langchain.ipynb to download the full dataset, index it and run the QnA example with local LLMs. This notebook shows only a simple example of the QnA system.
  5. Run the notebook notebooks/02-open-ai-langchain-base.ipynb to download the full dataset, index it and run the QnA example with OpenAI LLMs. This example is more complex and shows how to evaluate the QnA system.
  6. You can run the REST API for the QnA system using the make run_api command. Swagger will be available at http://localhost:8000/docs.
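
Before running the notebooks, it can help to confirm that the services from the steps above are actually listening. The following is a rough sketch, not part of the repository: it only checks that a TCP port accepts connections. Ports 6333 and 8000 come from the steps above; port 6379 is the Redis default and is an assumption here, since the Makefile may map a different port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# 6333 (Qdrant) and 8000 (REST API) are taken from the steps above.
# 6379 is the Redis default port and is an assumption.
SERVICES = {
    "Qdrant": 6333,
    "Redis": 6379,
    "REST API": 8000,
}


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```

An open port only shows that something is listening, not that the service is healthy, so treat this as a quick sanity check before opening the notebooks.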

Useful links

Courses