mmz-001/doc-qa-tutorial

Frequent errors when loading a PDF or asking questions

Opened this issue · 3 comments

Hi Sasmitha,

First of all, great work! I am very impressed. I came up with a similar idea and then found your code. Later on I found chatpdf.com; they have turned the idea into a polished product. I have tried both your code and ChatPDF, and theirs is quite stable. They also have some nice features: for example, after a PDF is loaded, GPT suggests three questions about it.

The problem I had with your code is that it often raises errors: sometimes while loading the PDF, sometimes while asking questions. I tried the same file in ChatPDF with no problem. Here is an example: I have attached the PDF file, and you will see the following error message when you load it. I hope you can figure out what the problem is and fix it.

Thanks!

Leo


[Screenshot of the app]
ChatDoc - The AI Bot Answering Your Questions based on a Document
Upload a PDF file, then you can ask questions; our ChatGPT will answer them based on the document.
Uploaded file: Chris_Mack_PhD_Thesis.pdf (0.7 MB)
openai.error.RateLimitError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).
Traceback:
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/app/chatdoc/app.py", line 18, in <module>
index = embed_text(parse_pdf(uploaded_file))
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/legacy_caching/caching.py", line 627, in wrapped_func
return get_or_create_cached_value()
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/legacy_caching/caching.py", line 611, in get_or_create_cached_value
return_value = non_optional_func(*args, **kwargs)
File "/app/chatdoc/utils.py", line 32, in embed_text
index = FAISS.from_texts(texts, embeddings)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 193, in from_texts
embeddings = embedding.embed_documents(texts)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 87, in embed_documents
responses = [
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 88, in <listcomp>
self._embedding_func(text, engine=self.document_model_name)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 76, in _embedding_func
return self.client.create(input=[text], engine=engine)["data"][0]["embedding"]
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_resources/embedding.py", line 33, in create
response = super().create(*args, **kwargs)
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
self._interpret_response_line(
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
raise self.handle_error_response(

Hey, thanks for reporting this issue.
My guess is that the rate limit on the free OpenAI API tier is causing the problem.
You can either use a paid API key or implement some sort of retry mechanism to work around it.
Take a look at the source code for KnowledgeGPT (a more advanced version of doc-qa) to see how you can implement it.
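The retry mechanism suggested above can be sketched as a small exponential-backoff wrapper. This is a minimal stdlib-only illustration, not code from doc-qa or KnowledgeGPT; the names `call_with_backoff` and `Flaky` are hypothetical. In the actual app you would wrap the OpenAI embedding call and pass `retry_on=(openai.error.RateLimitError,)`:

```python
import random
import time

def call_with_backoff(func, max_retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Call func(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries: let the caller see the error
            # Sleep base, 2*base, 4*base, ... plus jitter to spread retries out.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Stand-in for a rate-limited API call: fails twice, then succeeds.
class Flaky:
    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls < 3:
            raise RuntimeError("rate limited")
        return "embedding"

flaky = Flaky()
print(call_with_backoff(flaky, base_delay=0.01))  # prints "embedding"
```

A paid API key raises the rate limit, but a backoff like this still helps when many users upload PDFs at once, since each upload embeds every chunk of the document in quick succession.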