/langchain_semantic_search

Search and indexing your own Google Drive Files using GPT3, LangChain, and Python

Primary LanguageJupyter NotebookApache License 2.0Apache-2.0

Search and indexing your own Google Drive Files using GPT3, LangChain, and Python.

The jupyter notebook included here (langchain_semantic_search.ipynb) will enable you to build a FAISS index on your document corpus of interest, and search it using semantic search. Details of this flowchart are described in https://medium.com/@venuv62/can-chatgpt-be-your-bff-code-companion-4375fd73ec3a.

image

I've provided a test directory of Neuromodulation papers if you want to as a sample Drive folder to test against - https://drive.google.com/drive/folders/1eIBnSO7MVOW9-BKPCJhs7JuBDRyXPOFC?usp=sharing. Since the code needs a Google Drive directory path (not an https URL) to work with, you will have to :

  • copy the contents of this directory into a GDrive subdirectory of your own
  • set the gdrive_path variable in the jupyter notebook appropriately
  • set the question within print_answer to 'is sleep a health epidemic' for instance, which should give you a non-null answer

I will be working on a few enhancements to speed up the indexing (perhaps using a Vectorstore) and to optimize the query cost (using ideas from https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html)