CentralBank-LLM

A custom RAG-LLM model to analyse central banking forecasts

Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0)

CentralBank-LLM (ChatGPT, Langchain)

🥈Plotly-Langchain Challenge Winner | View Webinar Recording
Discover what top banking institutions really think about macroeconomic conditions (inflation, interest rates, GDP, unemployment, oil prices). The Plotly Dash app extracts the main highlights from the latest central banking publications and lets you query them.

The RAG-LLM retriever is trained to answer user prompts and generate plots exclusively using textual data downloaded directly from central banks. It also provides citations to the original publications (source, author, and page information).
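As a rough illustration of how such a citation could be assembled, the sketch below turns the metadata of one retrieved text chunk into a citation string. It assumes each chunk carries a metadata mapping with `source`, `author` and `page` keys; the function name `format_citation` and the key names are hypothetical and are not taken from the app's code.

```python
from typing import Mapping


def format_citation(metadata: Mapping[str, object]) -> str:
    """Build a short citation from a chunk's metadata (hypothetical keys)."""
    source = metadata.get("source", "unknown source")
    author = metadata.get("author", "unknown author")
    page = metadata.get("page")
    # Only mention the page when the metadata actually provides one.
    page_part = f", p. {page}" if page is not None else ""
    return f"{source} ({author}{page_part})"
```

Missing fields fall back to placeholder text, so an answer can still point to the source when the author or page is unavailable.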

Tutorial:

How to install & run the app:

  1. git clone https://github.com/viczommers/CentralBank-LLM.git
  2. change directory to the cloned repository: cd your-folder-path
  3. pip install -r requirements.txt
  4. python app.py
  5. open the app at http://localhost:8050/

Notes & Known Issues:

Langchain Chroma vectorstore deletion

ChromaDB is known to have issues in Streamlit/Dash apps when deleting collections or resetting the database, if ChromaDB was used in a running callback. As a result, new embeddings are appended to the old database. Every vectordb instance, including any retriever created with as_retriever(), must be set to None (that is, the connection to Chroma must be terminated) before the files are released for deletion. Once a vectordb instance is set to None, there is no way to re-initiate the database connection. This was partly addressed in the old delete_downloads() function:

    if os.path.exists(persist_directory):
        shutil.rmtree(persist_directory, ignore_errors=True)

However, the chroma.sqlite3 file will still exist, and a Dash app restart is required to terminate the callback and release the file for deletion (alternatively, it can be deleted manually). The langchain.vectorstores module does not include functionality to call client.reset() or track the collection, so I suggest using the chromadb library directly for similar projects. These are the main reasons why web scraping, text splitting and embedding should be done on a FastAPI backend server with MongoDB or Pinecone, and not locally.
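A minimal sketch of the "use chromadb directly" suggestion, assuming a recent chromadb release that provides `PersistentClient` and the `allow_reset` setting. The helper names `open_client` and `rebuild_collection`, the collection name and the toy embeddings are hypothetical; this is not the app's own code.

```python
from typing import List, Sequence


def open_client(persist_directory: str):
    """Open a persistent Chroma client with reset enabled."""
    # Imported lazily so this sketch still loads when chromadb is not installed.
    import chromadb
    from chromadb.config import Settings

    # allow_reset=True is required; without it client.reset() raises an error.
    return chromadb.PersistentClient(
        path=persist_directory,
        settings=Settings(allow_reset=True),
    )


def rebuild_collection(
    client,
    name: str,
    ids: List[str],
    texts: List[str],
    embeddings: List[Sequence[float]],
):
    """Wipe the database, then store a fresh set of embeddings."""
    # reset() drops every collection, so new data is not appended to old data.
    client.reset()
    collection = client.get_or_create_collection(name)
    # Precomputed embeddings are passed in, so no embedding model is invoked.
    collection.add(ids=ids, documents=texts, embeddings=embeddings)
    return collection
```

Calling `rebuild_collection` a second time on the same client replaces the earlier contents, which is the behaviour the LangChain wrapper could not provide here. Whether the underlying files can then be removed from disk while the app is running may still depend on the platform and the chromadb version.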

400 Bad Request:

Frequent requests to the Bank of England server will sometimes return fewer files than are actually available because of a cooldown. This can be remedied by changing the IP address or by modifying the for-loop to request only reports from February, May, August and November (at the expense of potentially missing emergency publications).
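The restricted loop could look roughly like the sketch below. It only builds the list of (year, month) pairs to request and leaves the download itself to a caller-supplied function, because the actual request code and URL pattern are not shown here. The names `report_periods`, `fetch_reports` and `fetch_one` are hypothetical, and the pause between requests is an added assumption rather than something the app is known to do.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

# Months named in the note above: February, May, August, November.
REPORT_MONTHS = (2, 5, 8, 11)


def report_periods(start_year: int, end_year: int) -> List[Tuple[int, int]]:
    """Return (year, month) pairs for the four report months of each year."""
    return [
        (year, month)
        for year in range(start_year, end_year + 1)
        for month in REPORT_MONTHS
    ]


def fetch_reports(
    periods: Iterable[Tuple[int, int]],
    fetch_one: Callable[[int, int], object],
    pause_seconds: float = 2.0,
) -> Dict[Tuple[int, int], object]:
    """Call fetch_one(year, month) for each period, pausing between calls."""
    results: Dict[Tuple[int, int], object] = {}
    for index, (year, month) in enumerate(periods):
        if index > 0:
            # Assumed mitigation: space requests out to avoid the cooldown.
            time.sleep(pause_seconds)
        results[(year, month)] = fetch_one(year, month)
    return results
```

For example, `report_periods(2023, 2024)` yields eight periods instead of twenty-four monthly ones, which cuts the number of requests to a third.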

Interested in adding NLP tools like this to your trading terminal? Get in touch with us.

Disclosures:

THIS DOCUMENT DOES NOT CONTAIN INSIDE INFORMATION FOR THE PURPOSES OF ARTICLE 7 OF THE MARKET ABUSE REGULATION (EU) 596/2014 (“MAR”). THE MATERIAL GENERATED BY THE APP IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY AND SHOULD NOT BE CONSTRUED AS INVESTMENT ADVICE OR SOLICITATION TO BUY OR SELL SECURITIES. THE MATERIAL IS NOT INTENDED TO BE USED AS A GENERAL GUIDE TO INVESTING, OR AS A SOURCE OF ANY SPECIFIC INVESTMENT RECOMMENDATIONS, AND MAKES NO IMPLIED OR EXPRESS RECOMMENDATIONS CONCERNING THE MANNER IN WHICH ANY CLIENT’S ACCOUNT SHOULD BE HANDLED.

© 2024 Lambda Capture Limited (Registration Number 15845351) 52 Tabernacle Street, London, EC2A 4NJ - All rights reserved