PDFToChat

Chat with your PDFs in seconds. Powered by Together AI and Pinecone.

Tech Stack · Deploy Your Own · Common Errors · Credits · Future Tasks

Tech Stack

Next.js App Router for the framework
Mixtral through Together AI inference for the LLM
M2 Bert 80M through Together AI for embeddings
LangChain.js for the RAG code
Pinecone or MongoDB Atlas for the vector database
Bytescale for the PDF storage
Vercel for hosting and for the Postgres DB
Clerk for user authentication
Tailwind CSS for styling

Deploy Your Own

You can deploy this template to Vercel or any other host. Note that you'll need to:

Set up Together AI
Set up a Pinecone or MongoDB Atlas database with 768 dimensions
- See instructions below for MongoDB
Set up Bytescale
Set up Clerk
Set up Vercel
(Optional) Set up LangSmith for tracing.

See the .example.env file for a list of all the required environment variables.

You will also need to prepare your database schema by running npx prisma db push.

MongoDB Atlas

To set up a MongoDB Atlas database as the backing vectorstore, you will need to perform the following steps:

Sign up on the MongoDB Atlas website, then create a database cluster. Find it under the Database sidebar tab.
Create a collection by switching to the Collections tab and creating a blank collection.
Create an index by switching to the Atlas Search tab and clicking Create Search Index.
Make sure you select Atlas Vector Search - JSON Editor, select the appropriate database and collection, and paste the following into the textbox:

{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "docstore_document_id",
      "type": "filter"
    }
  ]
}

Note that numDimensions is set to 768 to match the embeddings model we're using, and that there is a second, filter-type field on docstore_document_id. This lets us filter results by document later.
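
As a sanity check before pasting, the index definition above can be validated in code. The following is a minimal sketch, not part of this repo: `checkIndexDefinition` and its types are hypothetical names introduced here for illustration, and the only facts it relies on are the 768 dimensions and the docstore_document_id filter field described above.

```typescript
// Field shapes used by the Atlas Vector Search definition shown above.
type VectorField = {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: string;
};
type FilterField = { type: "filter"; path: string };
type IndexDefinition = { fields: Array<VectorField | FilterField> };

// 768 matches the embedding size stated in this README.
const EXPECTED_DIMENSIONS = 768;

// Hypothetical helper (an assumption, not code from this repo): returns a list
// of problems, or an empty list when the definition looks consistent.
function checkIndexDefinition(
  definition: IndexDefinition,
  expectedDimensions: number = EXPECTED_DIMENSIONS
): string[] {
  const problems: string[] = [];

  // Find the first vector field, narrowing the union type as we go.
  const vectorField = definition.fields.find(
    (field): field is VectorField => field.type === "vector"
  );
  if (vectorField === undefined) {
    problems.push("no vector field defined");
  } else if (vectorField.numDimensions !== expectedDimensions) {
    problems.push(
      `numDimensions is ${vectorField.numDimensions}, expected ${expectedDimensions}`
    );
  }

  // The filter field is what allows per-document filtering later.
  const hasDocumentFilter = definition.fields.some(
    (field) => field.type === "filter" && field.path === "docstore_document_id"
  );
  if (!hasDocumentFilter) {
    problems.push("missing filter field on docstore_document_id");
  }

  return problems;
}
```

An empty result means the definition agrees with the two requirements above; it does not prove the index exists in Atlas.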

You may name the index whatever you wish; just make a note of the name, since you'll need it for MONGODB_ATLAS_INDEX_NAME below.

Finally, retrieve and set the following environment variables:

NEXT_PUBLIC_VECTORSTORE=mongodb # Set MongoDB Atlas as your vectorstore

MONGODB_ATLAS_URI= # Connection string for your database.
MONGODB_ATLAS_DB_NAME= # The name of your database.
MONGODB_ATLAS_COLLECTION_NAME= # The name of your collection.
MONGODB_ATLAS_INDEX_NAME= # The name of the index you just created.
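
A missing or empty variable is easy to overlook, so it can help to check the list above at startup. This is a minimal sketch under assumptions: `missingMongoEnvVars` is a hypothetical helper, not code from this repo, and it only knows about the variable names listed above.

```typescript
// Environment variable names taken from the list above.
const MONGODB_ENV_VARS = [
  "MONGODB_ATLAS_URI",
  "MONGODB_ATLAS_DB_NAME",
  "MONGODB_ATLAS_COLLECTION_NAME",
  "MONGODB_ATLAS_INDEX_NAME",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper (an assumption, not part of this repo): returns the names
// of MongoDB variables that are unset or blank. When MongoDB is not the
// selected vectorstore, nothing is required and the result is empty.
function missingMongoEnvVars(env: Env): string[] {
  if (env["NEXT_PUBLIC_VECTORSTORE"] !== "mongodb") {
    return [];
  }
  return MONGODB_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In server-side code you would typically pass `process.env` as the argument and fail fast if the returned list is not empty.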

Common Errors

Check that you've created a .env file that contains valid, working API keys, plus the environment and index name.
Check that you've set the vector dimensions to 768 and that the index name matches the one specified in your .env file.
Check that you've added a credit card on Together AI if you're hitting rate-limiting issues on the free tier.
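
A dimension mismatch tends to surface as an opaque vector database error, so it can be caught earlier in code. The sketch below assumes a hypothetical helper, `assertEmbeddingDimensions`, that is not part of this repo; the only value it takes from this README is the 768-dimension requirement.

```typescript
// 768 is the dimension count this README requires for the vector index.
const EMBEDDING_DIMENSIONS = 768;

// Hypothetical guard (an assumption, not code from this repo): throws a
// descriptive error if an embedding does not have the expected length.
function assertEmbeddingDimensions(
  embedding: number[],
  expected: number = EMBEDDING_DIMENSIONS
): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, but the index expects ${expected}.`
    );
  }
}
```

Calling this on one embedding before writing to the vector database gives a clearer failure than a rejected insert.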

Credits

Youssef for the design of the app
Mayo for the original RAG repo and inspiration
Jacob for the LangChain help
Together AI, Bytescale, Pinecone, and Clerk for sponsoring

Future Tasks

These are some future tasks that I have planned. Contributions are welcome!