microsoft/RAG_Hack

Project: Rag News Hack

Opened this issue · 1 comments

Project Name

Rag News Hack

Description

Rag News

Fake news is a hot topic all over the world, in special when we are dealing with political affairs.
The objective of Rag News is to create a curated repository of trustworthy news that can then be used to verify if a given information is true or not, based on the information we have.

Below are the principal points in the solution

Multiple databases support

The development of RagNews served as a base application for the start of the development of a modular to rag (and LLM agents in general). The main objective of this framework is to offer an Object Oriented, reusable environment, where configurations can be saved in a backend (JSON in this case), and other components can be exchanged as needed (changing one rag backend for another, for example).
Because of this modular approach, Rag News supports all four databases that were discussed during this hack (Azure AI Search, PostgreSQL, Cosmos DB Mongo DB and SQLServer), implementing the concepts and techniques discussed in the presentations.

Agentic query strategy

The implementation of the "query translation" in this solution differs significantly from what was discussed during the hack events. We decided to give complete freedom to LLM to query the RAG database as it saw feet and how many times it wanted. We did this to give it more freedom to search if a given information is true or not.

Review agent

Other strategy employed in this project was the use of one additional agent in the flow. In this case the additional agent was implemented as a "tool" that the main LLM could use when it sees fit. The objective of this additional LLM is to provide better insights and help the main LLM write a "good" answer.

Multiple queries at the same time

We have also given the agent the option to send a list of terms to search, instead of just one. Based on this list of terms we do multiple searches on the RAG store, and re-rank the results using RRF. The objective of this approach is to try to get even better results, assuming that the combination of queries should filter the intersection of documents that are more important (this is pending evaluation, see next point)

Points of improvement

  • Improve the system prompt for the main agent;
  • Implement a RAG evaluation strategy to consistently track improvements on changes;
  • Implement different chunking strategies;
  • Implement a Graph Knowledge base on the docs database so that when doing the RAG stage we can search for "related" entities.
  • Implement another step on the process that will confirm that the information generated by the LLM is, indeed, grounded on the sources it cites.

Final notes

This application was completely developed during the RagHack, and as such does not possess a very nice and intuitive user interface.
Gradio was used to quick prototype and test the user interface.

Technology & Languages

  • JavaScript
  • Java
  • .NET
  • Python
  • AI Studio
  • AI Search
  • PostgreSQL
  • Cosmos DB
  • Azure SQL

Project Repository URL

https://github.com/KhaoticMind/rag_news_hack

Deployed Endpoint URL

N/A

Project Video

https://youtu.be/A1BnRP_lVzU and https://youtu.be/ddk8s26UjJ8

Team Members

KhaoticMind

Hello @KhaoticMind, thank you for participating in RAG Hack!

The team is working hard to distribute badges. Please have each team member fill out this form:
aka.ms/raghack/badge-dist

Thank you!