Welcome to our demo :)
This demo requires that you have installed and configured Miniconda on Windows. It also requires an environment-win.yaml file that is already set up with channels and dependencies.
I recommend using PowerShell for these commands. If conda works, you will see (base) in front of your path. And don't forget to activate your environment.
-
Run start.bat
.\Start.bat
-
Run install_pnpm.ps1
.\install_pnpm.ps1
-
Restart your shell
-
Activate your env
conda activate Oracle-Demo-1
-
Clone the repo
git clone https://github.com/Chugarah/gpt4-pdf-chatbot-langchain.git
cd gpt4-pdf-chatbot-langchain
pnpm install
pnpm add sharp
-
Set up your .env file
Copy .env.example into .env. Your .env file should look like this:
OPENAI_API_KEY=
PINECONE_API_KEY=
PINECONE_ENVIRONMENT=
PINECONE_INDEX_NAME=
ANSWER_LANGUAGE=
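A quick way to catch an incomplete .env file before running anything is to check for blank values up front. The snippet below is only a minimal sketch: the helper name `findMissingEnvVars` is hypothetical and not part of this repo, and it assumes the four key, environment and index variables are mandatory while ANSWER_LANGUAGE is optional.

```typescript
// Variable names come from the .env listing above.
// Assumption: ANSWER_LANGUAGE is optional, so it is not in this list.
const REQUIRED_ENV_VARS: readonly string[] = [
  'OPENAI_API_KEY',
  'PINECONE_API_KEY',
  'PINECONE_ENVIRONMENT',
  'PINECONE_INDEX_NAME',
];

// Hypothetical helper: returns the names of required variables
// that are unset or contain only whitespace.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}
```

In a Node script you could call it as `findMissingEnvVars(process.env)` after the .env file has been loaded and stop early if the returned list is not empty.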
-
We need to update two things: the Pinecone index name and namespaces. Namespaces are the folders you have in your docs folder. Example:
# This is a namespace. ==> space-sci
docs/space-sci
We need to edit two files: config/pinecone.ts and utils/makechain.ts.
-
In the config/pinecone.ts file, change PINECONE_INDEX_NAME to the index name you created in Pinecone. Example:
export const PINECONE_INDEX_NAME = 'demo-data';
-
Now, we need to add or remove namespaces based on your docs folder. Remember that the namespaces need to exactly match the folder names. For example:
export const TOPICS = [
  // venus-atmosphere-life
  {
    TOPIC: 'Life in the Atmosphere of Venus',
    NAMESPACE: 'venus-atmosphere-life', // MUST ONLY CONTAIN LOWER CASE LETTERS A-Z AND HYPHENS
    PROMPT: 'What evidence is there that life exists in the atmosphere of Venus?',
  },
  // supreme-court-cases
  {
    TOPIC: 'Supreme Court Cases',
    NAMESPACE: 'supreme-court-cases', // MUST ONLY CONTAIN LOWER CASE LETTERS A-Z AND HYPHENS
    PROMPT: 'What precedent was set by Morse v. Frederick?',
  },
];
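Because a mismatch between TOPICS and the docs folder is easy to miss, it can help to compare the two explicitly. The following is a sketch under stated assumptions, not code from this repo: `checkNamespaces` and the `Topic` interface are hypothetical names, the list of folder names is passed in by the caller, and the pattern encodes the "lower case letters a-z and hyphens" rule from the comment in TOPICS.

```typescript
// Shape of one TOPICS entry, as shown in the example above.
interface Topic {
  TOPIC: string;
  NAMESPACE: string;
  PROMPT: string;
}

// Assumed rule from the TOPICS comment: only a-z and hyphens.
const NAMESPACE_PATTERN = /^[a-z-]+$/;

// Hypothetical helper: returns one message per problem found.
function checkNamespaces(topics: Topic[], docFolders: string[]): string[] {
  const problems: string[] = [];
  const folders = new Set(docFolders);
  const namespaces = new Set(topics.map((t) => t.NAMESPACE));

  for (const ns of namespaces) {
    if (!NAMESPACE_PATTERN.test(ns)) {
      problems.push(`invalid namespace: ${ns}`);
    }
    if (!folders.has(ns)) {
      problems.push(`no docs folder for namespace: ${ns}`);
    }
  }
  for (const folder of folders) {
    if (!namespaces.has(folder)) {
      problems.push(`no TOPICS entry for folder: ${folder}`);
    }
  }
  return problems;
}
```

An empty result means every namespace is well formed and has a folder of the same name, and every folder has a TOPICS entry.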
-
In utils/makechain.ts, change the QA_PROMPT for your own use case. Change modelName in new OpenAIChat to gpt-3.5-turbo if you don't have access to gpt-4. Please verify outside this repo that you have access to gpt-4, otherwise the application will not work with it.
This repo can load multiple PDF files :)
-
Inside the docs folder, add your PDF files or folders that contain PDF files.
-
Run the script
npm run ingest
to 'ingest' and embed your docs. If you run into errors, see the troubleshooting notes below.
-
Check the Pinecone dashboard to verify that your namespace and vectors have been added.
Once you've verified that the embeddings and content have been successfully added to your Pinecone index, run the app with
pnpm run dev
to launch the local dev environment, and then type a question in the chat interface.
You can now run the app in a Docker container. Start your favorite terminal and run these commands.
cd gpt4-pdf-chatbot-langchain/docker
# Build the image
docker compose build
# Run the container
docker compose up
# To build app
pnpm run build
# To start app
pnpm run start
# For development
npm run dev
There are different ways to run the app. We have two options. The first is to generate vector data to feed into Pinecone. The second is to run the webserver and the chatbot.
-
Start PowerShell or your favorite terminal
-
Run Shell Command
conda activate Oracle-Demo-1
-
Navigate to your project folder
cd gpt4-pdf-chatbot-langchain
-
Run the Vector Generator. Use this when you want to upload your documents to Pinecone.
npm run ingest
-
To run the webserver and the chatbot, build the app and then start it
pnpm run build
pnpm run start
In general, keep an eye on the issues and discussions sections of this repo for solutions.
General errors
-
Make sure you're running the latest Node version. Run
node -v
-
Try a different PDF or convert your PDF to text first. It's possible your PDF is corrupted, scanned, or requires OCR to convert to text.
-
Console.log the env variables and make sure they are exposed.
-
Make sure you're using the same versions of LangChain and Pinecone as this repo.
-
Check that you've created an .env file that contains your valid (and working) API keys, environment and index name.
-
If you change modelName in OpenAIChat, note that the correct name of the alternative model is gpt-3.5-turbo
-
Make sure you have access to gpt-4 if you decide to use it. Test your OpenAI keys outside the repo and make sure they work and that you have enough API credits.
-
Check that you don't have multiple OpenAI API keys in your global environment. If you do, the values in the project's local .env file will be overridden by the system env variables.
-
Try to hard code your API keys into the process.env variables.
Pinecone errors
-
Make sure the environment and index in your Pinecone dashboard match the ones in the pinecone.ts and .env files.
-
Check that you've set the vector dimensions to 1536.
-
Make sure your pinecone namespace is in lowercase.
-
Pinecone indexes of users on the Starter (free) plan are deleted after 7 days of inactivity. To prevent this, send an API request to Pinecone to reset the counter before the 7 days are up.
-
Retry from scratch with a new Pinecone project, index, and cloned repo.
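Two of the Pinecone checks above, the vector dimension and the lowercase namespace, can be verified in code before anything is sent to the index. This is a rough sketch only: `findPineconeProblems` is a hypothetical helper that is not part of this repo or of the Pinecone client, and it assumes you already hold an embedding vector as a plain array of numbers.

```typescript
// Dimension value taken from the checklist above.
const EXPECTED_DIMENSION = 1536;

// Hypothetical helper: returns one message per failed check.
function findPineconeProblems(
  vector: number[],
  namespace: string,
  expectedDimension: number = EXPECTED_DIMENSION,
): string[] {
  const problems: string[] = [];
  if (vector.length !== expectedDimension) {
    problems.push(
      `vector has ${vector.length} dimensions, expected ${expectedDimension}`,
    );
  }
  if (namespace !== namespace.toLowerCase()) {
    problems.push(`namespace "${namespace}" is not lowercase`);
  }
  return problems;
}
```

If the returned list is empty, a dimension or casing mismatch is unlikely to be the cause, and the remaining checks in the list are the next place to look.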
Frontend of this repo is inspired by langchain-chat-nextjs