A GenAI tool that helps fight online fake content with fact-based generations
- Background:
- Began as an idea for the Spark-2023 hackathon, a hackathon dedicated to Holocaust remembrance and to fighting fake content and antisemitism
- We came together as a group of strangers who shared the same intent
- We met with key people, including historians, youth ambassadors, and others who care about the issue and actively search for and respond to hateful posts online.
- Discovery is mostly a manual process, although some use online tools for monitoring and discovery.
- The number of people actively responding is negligible compared to the volume of written content.
- That volume is expected to rise dramatically with the introduction of Generative AI.
- We realized we could combine Generative AI with documents containing historical facts to fight fake or misleading online posts. The focus is not on convincing the author of the post, but the audience around it.
- Won first place with a minimal working demo. The tool was trained on the Wikipedia page about the Holocaust and provided accurate answers.
- After closely examining how the minimal demo could be turned into an actual, publicly available tool, we understood:
- It has to support larger sources of facts and not rely solely on Wikipedia
- It has to be open-source
- It has to be completed before the national Holocaust Remembrance Day (18-04-23)
- Goals:
- Create a simple tool that can analyze social media content (on-demand, per post) and suggest a fact-based response with ease
- Create a network of key people who want to see this tool come alive and use it. Ask them to promote us and the tool
- Reach online and traditional news agencies to promote the project
- Reach morning TV shows to showcase and promote the tool
- Achieve a working tool by 16-04-23
- Plan & roles:
- Designer & product - handle landing-page, messaging, demo and install page
- Biz/Ops - handle strategical connections, PR, market validation and “sell” the tool
- FE - create chrome extension and FE part of the app
- BE - create app dedicated endpoints for processing Q&A, uploading files and managing facts (add remove facts)
- AI - Facilitate process of pre-processing of the unstructured data into structured list of facts, and design a utility bot that answers based on those facts (and maybe references back, tbd)
- Meta - promote the project in the local community, bring more contributors, etc
- PM - Manage the tasks and milestones, make decisions
- Team:
- Elya Livshitz - AI
- Ofer Nidam - Biz/Ops
- Ori Kessler - FS
- Yoni Kessler - Design/FS
- Stav Cohen Lasri - Product
- Ido Schwartz - FS
- Erez Tepper - AI/Meta
- Mentors/collaborators/partners:
- Yossi Klein
- Ram Yonish
- Ghetto Fighters' House (@בית לוחמי הגטאות) - Yaron
- "Facts managers" select relevant historical documents containing unstructured texts, multilingual, that describe historical events related to the Holocaust.
- They upload those files to Savee back-office webapp, it stores the files in the cloud
- Each file being "compressed" from large blobs of unstructured text into smaller list of concise fact list
- Each group of facts are aggregated or chunked into small chunks
- Performs embedding process for each chunk and stores each chunk with the embedding vector.
- Users on the internet see the landing page and navigate to the install instructions page
- Install the Chrome extension
- Users (aka "Ambassadors") navigate to a page on social media that contains fake or wrong content
- User clicks on the extension and selects the post
- The extension grabs the text and sends it to Botify, a purpose-specific bot for factual Q&A
- Botify takes the text, embeds it, and searches the DB for facts semantically related to the text (using embedding vectors).
- The top-k results are taken and composed into a GPT prompt
- The prompt is sent to OpenAI for completion, constructed so that the response is based only on the provided facts and nothing else (see the Q&A sketch after this flow).
- The resulting text is sent back to the user
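The back-office ingestion flow can be illustrated with a short sketch. This is a minimal, hypothetical TypeScript example (matching the project's yarn-based web stack), assuming the concise facts have already been extracted from the uploaded documents; names such as `FactChunk`, `chunkFacts`, `embedText`, and `ingestFacts` are illustrative and not part of the actual codebase.

```ts
// Hypothetical ingestion sketch: group extracted facts into chunks,
// embed each chunk, and store the chunk together with its vector.

interface FactChunk {
  text: string;          // concatenated facts in this chunk
  embedding: number[];   // embedding vector used for semantic search
}

// Group a flat list of facts into small chunks (e.g. 5 facts per chunk).
function chunkFacts(facts: string[], size = 5): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < facts.length; i += size) {
    chunks.push(facts.slice(i, i + size).join("\n"));
  }
  return chunks;
}

// Embed a chunk of text via OpenAI's embeddings endpoint.
async function embedText(text: string, apiKey: string): Promise<number[]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
  });
  const data = await res.json();
  return data.data[0].embedding;
}

// Ingest one document's extracted facts into an in-memory store
// (the real app would persist these in the cloud DB instead).
async function ingestFacts(facts: string[], apiKey: string): Promise<FactChunk[]> {
  const store: FactChunk[] = [];
  for (const text of chunkFacts(facts)) {
    store.push({ text, embedding: await embedText(text, apiKey) });
  }
  return store;
}
```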
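The Q&A flow can be sketched similarly. Again a hedged, minimal example reusing the hypothetical `FactChunk` and `embedText` helpers from the ingestion sketch: it embeds the post text, ranks stored chunks by cosine similarity, takes the top k, and composes a prompt that instructs the model to answer only from the provided facts.

```ts
// Hypothetical Q&A sketch: retrieve the most relevant fact chunks
// and ask OpenAI to answer using only those facts.

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Retrieve the top-k chunks most semantically related to the post text.
async function retrieveFacts(
  post: string, store: FactChunk[], k: number, apiKey: string
): Promise<string[]> {
  const queryEmbedding = await embedText(post, apiKey);
  return store
    .map((c) => ({ c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.c.text);
}

// Compose a fact-restricted prompt and send it for completion.
async function answerPost(post: string, store: FactChunk[], apiKey: string): Promise<string> {
  const facts = await retrieveFacts(post, store, 3, apiKey);
  const prompt =
    "Answer the following post using ONLY the facts below. " +
    "If the facts are insufficient, say so.\n\n" +
    `Facts:\n${facts.join("\n")}\n\nPost:\n${post}`;
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Restricting the completion to the retrieved facts, plus an explicit "say so if insufficient" instruction, is what keeps the suggested response fact-based rather than free-form.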
- Clone locally
- Run `yarn install`
- Run `yarn web`
- Open a browser at http://localhost:3012/
- Check the main package.json for more commands
Contributions are welcome! Read the contribution guidelines first, then fork this repository and submit a Pull Request.
Wanna talk? Chat with us here.
Please use GitHub Issues to contact us.