This is a Next.js project written in JavaScript that builds a semantic search engine for any GitHub repository supplied by the user. The program reads the repository's Markdown files, splits them into chunks with a maximum size of 500, and stores them in an Upstash Vector database using LangChain and OpenAI Embeddings. Once the files are stored in the database, users can run search queries against them.
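To make the chunking step concrete, here is a minimal sketch of splitting text into pieces no larger than 500. It does not use LangChain and is not the project's actual splitter: `chunkText` is a hypothetical helper, and it assumes the size limit is measured in characters and that paragraph boundaries (blank lines) are the preferred break points.

```javascript
// Hypothetical helper, not the project's code: split text into chunks of at
// most maxSize characters, preferring to break at blank lines.
function chunkText(text, maxSize = 500) {
  const chunks = [];
  let current = "";

  for (const paragraph of text.split(/\n\s*\n/)) {
    const piece = paragraph.trim();
    if (piece === "") continue;

    // Close the current chunk if adding this paragraph (plus the blank-line
    // separator) would exceed the limit.
    if (current !== "" && current.length + 2 + piece.length > maxSize) {
      chunks.push(current);
      current = "";
    }

    if (piece.length > maxSize) {
      // A single oversized paragraph is cut into fixed-size slices.
      for (let i = 0; i < piece.length; i += maxSize) {
        chunks.push(piece.slice(i, i + maxSize));
      }
      continue;
    }

    current = current === "" ? piece : current + "\n\n" + piece;
  }

  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each returned chunk would then be embedded and upserted as one vector. A character-based limit is only one possible reading of "max size 500"; a token-based limit would need a tokenizer instead of `length`.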
If you have questions, you can find a detailed explanation of the project in the blog post here. Also, feel free to create issues on the repository.
- Create an Upstash Vector Index
- Get an OpenAI API Key
To install the project locally, either to run it or to make changes, follow these steps:
- Clone the source code to your device
git clone https://github.com/kaanguneyli/semantic_search_for_docs.git
- Go to the project folder
cd semantic_search_for_docs
- Install next if it is not installed already
npm install next
- Create a .env file and fill it with your API keys.
# .env
UPSTASH_VECTOR_REST_URL="..."
UPSTASH_VECTOR_REST_TOKEN="..."
OPENAI_API_KEY="..."
- Run the project
npm run dev
- Go to http://localhost:3000/ or http://localhost:3000/api/search?query=your-input
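Queries typed into the address bar need to be URL-encoded, so calling the search endpoint from a script can be easier. The sketch below assumes the dev server listens on http://localhost:3000 (the Next.js development server serves plain HTTP by default) and a runtime with a global `fetch`, such as Node.js 18 or later. `buildSearchUrl` and `search` are hypothetical helper names, and because the response format is not described here, the body is returned as raw text.

```javascript
// Assumption: the dev server started by `npm run dev` listens on this address.
const DEFAULT_BASE_URL = "http://localhost:3000";

// Build the search endpoint URL; URLSearchParams takes care of the encoding.
function buildSearchUrl(query, baseUrl = DEFAULT_BASE_URL) {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("query", query);
  return url.toString();
}

// Call the endpoint and return the raw response body as text.
async function search(query, baseUrl = DEFAULT_BASE_URL) {
  const response = await fetch(buildSearchUrl(query, baseUrl));
  if (!response.ok) {
    throw new Error(`Search request failed with status ${response.status}`);
  }
  return response.text();
}
```

With the server running, `search("how do I create an index?").then(console.log)` would print whatever the endpoint returns.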
Once you run the program, you will see two input fields and one submit button. The first field has the prompt Owner name and the second has Repo name. Fill these in according to your GitHub repository (for example, 'Upstash' and 'Docs'). The program will then create the embeddings and upsert them to your index.
If you use the search endpoint, the program returns the 3 results that most closely match your query. You can also see the names of the matching files on the screen.
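To illustrate what "most closely match" means for embeddings, here is a small in-memory sketch that ranks stored vectors against a query vector and keeps the top 3. In the project itself the ranking is done by the Upstash Vector index, and the similarity metric depends on how that index was created; cosine similarity, the `topK` helper, and the `{ fileName, vector }` entry shape are assumptions made for this example only.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: score every entry and return the k best matches,
// highest score first. Entries are assumed to look like { fileName, vector }.
function topK(queryVector, entries, k = 3) {
  return entries
    .map((entry) => ({
      fileName: entry.fileName,
      score: cosineSimilarity(queryVector, entry.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A real index avoids scoring every stored vector on each query; the exhaustive scan above is only meant to show the ranking idea.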
You can deploy the project on the Vercel Platform, from the creators of Next.js, using the button below.