File Browser Doesn't Show Files
nategoethel opened this issue · 19 comments
I can't seem to select an individual file in that folder though.
Originally posted by @nigelp in #4 (comment)
Hi! Right now the app only supports selecting folders, so if you select the folder that contains the document it should load all of its contents. I know it's a bit counterintuitive and I'm working on that!
Also, some people have reported issues running the LLM on Windows. Please let me know if you have the same problem.
Yeah, basically it's not working on Windows at all as far as I can tell.
Hmmmm, thanks for letting me know. I still haven't found the source of the issue... I'll pull the Windows version from the website for now and will keep you updated.
The fact that the documents take a while to load suggests that the Python backend is working and is connected to the rest of the app. Maybe the issue is related to the Python dependencies needed to run the LLM, or to the LLM file itself. If it is the former, I'd guess it is llama-cpp and its CUDA configuration. If you have installed Dot on your desktop, could you please run the following commands (adjust the path for your system):
& 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip uninstall llama-cpp-python
& 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip install llama-cpp-python
This reinstalls the llama-cpp library used to run the LLM and removes the GPU acceleration settings. It will of course make the program slower, but if it works it would show where the problem is. Please let me know if you can try this and if it works! :)
According to the llama-cpp-python documentation, Visual Studio appears to be required to install the library, because a C compiler is needed to build llama-cpp.
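A quick way to confirm that the reinstall took effect is to ask the bundled interpreter whether it can find the package. This is a minimal sketch and not part of Dot: `package_status` is a hypothetical helper name, and it should be run with the same `python.exe` used in the commands above.

```python
import importlib.util
from importlib import metadata


def package_status(module_name, dist_name):
    """Return the installed version string, or None if the module is missing.

    module_name is the import name (e.g. "llama_cpp"); dist_name is the
    pip distribution name (e.g. "llama-cpp-python").
    """
    # find_spec locates the module without importing it, so a broken
    # native build cannot crash this check.
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Importable, but no pip metadata was found for that name.
        return "unknown"


if __name__ == "__main__":
    version = package_status("llama_cpp", "llama-cpp-python")
    print("llama-cpp-python:", version or "not installed")
```

If this prints "not installed" right after the pip command reported success, pip was probably run with a different interpreter than the one the app uses.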
OK, well it slammed my CPU for a while but worked this time. Produced a summary of a 1 page docx in around 50 seconds. But I'm running an i7 with an 8GB RTX4060, so it probably should be a little faster? But at least it's working now. :)
That's amazing news! Thank you so much for the help, now I at least know where the problem comes from! If you want to try to set it up for use with GPU acceleration, the following steps should work:
1- Install CUDA toolkit: link
2- Uninstall llama-cpp-python: & 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip uninstall llama-cpp-python
3- Reinstall it with the following command: & "C:\Users\Desktop\Dot\resources\llm\python\python.exe" -c "import os; os.environ['CMAKE_ARGS'] = '-DLLAMA_CUBLAS=on'; os.environ['FORCE_CMAKE'] = '1'; import pip._internal; pip._internal.main(['install', '--upgrade', '--force-reinstall', 'llama-cpp-python', '--no-cache-dir'])"
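Step 3 sets two environment variables and then calls pip from inside Python. The same idea can be sketched with a plain command list and an environment dictionary, which avoids pip's internal API. This is an illustration only: `cuda_reinstall_command` is a hypothetical helper, the path is the example path from the steps above, and the `CMAKE_ARGS` value is copied from step 3 rather than checked against any particular llama-cpp-python release.

```python
import os


def cuda_reinstall_command(python_exe, base_env=None):
    """Build the pip command and environment for a CUDA rebuild.

    Returns (command, env) so both can be inspected before anything runs.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Values taken from step 3 above.
    env["CMAKE_ARGS"] = "-DLLAMA_CUBLAS=on"
    env["FORCE_CMAKE"] = "1"
    command = [
        python_exe, "-m", "pip", "install",
        "--upgrade", "--force-reinstall", "--no-cache-dir",
        "llama-cpp-python",
    ]
    return command, env


command, env = cuda_reinstall_command(
    r"C:\Users\Desktop\Dot\resources\llm\python\python.exe"
)
print(" ".join(command))
# To actually run it (this starts a long native build):
# import subprocess; subprocess.run(command, env=env, check=True)
```

Passing the command as a list sidesteps the nested quoting that makes the one-line version easy to mistype.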
Hmm..thanks. You had an extra " after the python.exe but even when I fixed that it failed again. You're probably going to need the whole file this time. :)
llamafail.txt
OK fixed it, got it working, kind of - thanks to Claude 3 Opus :) Reinstalled CUDA and at least it seems to access something.
BUT it can't access the PDFs. Tried both local and the Big Dot and both failed. Big dot hallucinated an answer after saying "I'm unable to access or view the Q Star document directly. However, I can share with you..."
Nice! That is really interesting, big dot's answer makes sense as it is meant for general use and is not aware of the documents, did Doc Dot give any answer at all or was it stuck again?
Also how are you finding Claude 3? Is it really better than GPT4?
Hmm...the Doc Dot answer was three lines of lame. :) Started with "I cannot directly access or summarize a PDF from this text alone. However, I can tell you that..." It's a shame because the UI and potential for adoption is really good. But it's not reading the doc at all it seems.
Claude Opus is SO much better than GPT4. It's my daily driver now.
Hmmmm, there are a few things I can think of that could cause the issue here:
1- It does not have any access to the PDF: In such a case it would reply something along the lines of "I do not have the answer to that, the text only mentions "foo" and "bar""
2- Maybe it did not understand the prompt properly: Depending on the prompt, Dot can get confused. Because of the way Dot works, it only has access to the text inside the document, so asking "what is doc X about?" might lead to the model searching for references to "doc X" inside the document itself and not finding anything. (Making it more aware of the documents themselves is something I'm trying to figure out, but it's turning out to be quite a challenge)
3- Its replies are complete nonsense: In such a case it probably has more to do with the LLM itself than the embeddings, but this only seems to happen when there are issues with the context length, which as far as I understand shouldn't be the case in Doc Dot.
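Option 2 is easier to see with a toy retriever. The sketch below is not Dot's code: it assumes the document is split into chunks and that the chunk most similar to the question is what the model sees, and it swaps real embeddings for a plain word-overlap score. `best_chunk` and the sample chunks are made up for illustration.

```python
import re


def words(text):
    """Lowercase word set, used as a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_chunk(question, chunks):
    """Return the chunk that shares the most words with the question."""
    q = words(question)
    return max(chunks, key=lambda c: len(q & words(c)))


chunks = [
    "The search procedure expands the node with the highest value estimate.",
    "As noted earlier in this document, the author thanks the reviewers.",
]

# A question about the content lands on the chunk that discusses it.
print(best_chunk("How does the search procedure pick a node?", chunks))
# A question about "the document" lands wherever those words happen to appear.
print(best_chunk("Summarise this document", chunks))
```

The second query picks the acknowledgements-style chunk because that is where the words "this document" occur, which is the failure mode described in option 2.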
Do any of these align with what you are seeing? Also really tempted to get Claude 3, especially with all the uni coursework I have lately lol :)
Interesting. Yes that would make sense about the prompting. I've just been asking 'summarise the document', and the results I get back are not useful. Do you have some sort of base prompt in the background you could tweak perhaps? Like 'you are a helpful assistant who can read the contents of documents and blah'? Or you're probably doing that already? Maybe you should offer some optimum prompts for users to try to get best results? Let me send over a full answer response
And yes, go for Claude 3. It's so refreshing to get actual useful answers instead of the 'I'm sorry I am only...' rubbish from GPT4 nowadays.
The current base prompt is "Use the following pieces of context to answer the question at the end. If you do not know the answer, just say you don't know, don't try to make up an answer." And looking at it I can see why it might be confused by a 'summarise the document' as the prompt only mentions 'context'.
I will try to modify it and see if that changes anything. You can also modify it if you want, as the Python scripts are easily accessible from within the app files: they should be in \resources\llm\scripts, and the system prompt is on line 77 of the docdot.py file.
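To make the mismatch concrete, here is a rough sketch of how a base prompt like this is typically combined with retrieved text and the user's question. The template layout, `build_prompt`, and the sample strings are assumptions for illustration; the real code in docdot.py may assemble things differently.

```python
BASE_PROMPT = (
    "Use the following pieces of context to answer the question at the end. "
    "If you do not know the answer, just say you don't know, "
    "don't try to make up an answer."
)


def build_prompt(context_chunks, question, base_prompt=BASE_PROMPT):
    """Join the base prompt, retrieved chunks, and question into one string."""
    context = "\n\n".join(context_chunks)
    return f"{base_prompt}\n\n{context}\n\nQuestion: {question}\nAnswer:"


prompt = build_prompt(
    ["(text of the retrieved chunk goes here)"],
    "summarise the document",
)
print(prompt)
```

Nothing in the assembled text tells the model that the context is a document, so a request to "summarise the document" refers to something the prompt never names. Swapping in a different `base_prompt` is the kind of change being suggested here.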
Honestly, this is the funniest part of working with LLMs (however annoying it can be): giving actual human-like prompts that influence the way code behaves is quite surreal.
OK, did some tests.
- Asked Claude to create a suitable prompt. Did a very long version. I asked it to shorten the prompt and got:
"You are an AI document analysis and question-answering system. Your task is to read and comprehend documents, store the information, and provide accurate, relevant answers to user questions based on the document's content. When a question is asked, analyze it, search your memory for the most relevant information, and formulate a clear, concise response. If the document does not contain sufficient information to answer the question, inform the user accordingly."
- Tried loading a document (15MB PDF) about Q-Star. Attached.
- Asked this question: "list out the key points in this document"
- Got this answer (much hallucinated):
To list out the key points in this document, I would need to carefully read and understand the content of the document. Some potential key points from the provided text include:
Scott Aaronson wrote a letter to his 11-year-old self.
He discussed how he had always been interested in science and technology.
He mentioned that he had recently started learning about quantum mechanics.
He expressed excitement about the potential discoveries and innovations that could come from this field of study.
Alberto Romero posted about the general response to this text.
The response was generally positive, with many people expressing excitement and inspiration from Scott Aaronson's letter to his younger self.
Some people also shared their own experiences and stories related to science, technology, and personal growth.
Overall, the response was a testament to the power of curiosity, learning, and self-reflection in shaping our lives and the world around us.
I think the LLM is not up to the task.
q star info.pdf
- Interesting fact: this is the most information I've got out of Dot so far. So the prompt seems to have improved the response. Except it's completely bogus! :)
That is really interesting, thank you very much for your help! But yeah, the context length of local models is not yet up to the task :/
At only 8000 tokens of context length, it takes its answer from the chunk of text it considers 'most relevant' to the question asked, so in cases where lots of information is required it is not powerful enough...
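The limit described here can be pictured as a packing problem: ranked chunks are added until the budget runs out, and everything else is never shown to the model. This is an assumption-heavy illustration, not Dot's implementation: `pack_chunks` is hypothetical, the token count is approximated by whitespace-separated words, and the 8000 figure is the context length quoted above.

```python
def rough_tokens(text):
    """Very rough token estimate: one token per whitespace-separated word."""
    return len(text.split())


def pack_chunks(ranked_chunks, budget):
    """Keep chunks in rank order until the next one would exceed the budget."""
    kept, used = [], 0
    for chunk in ranked_chunks:
        cost = rough_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept, used


# Three ranked chunks of 3000 "tokens" each against an 8000-token window.
ranked = [" ".join(["word"] * 3000) for _ in range(3)]
kept, used = pack_chunks(ranked, budget=8000)
print(len(kept), used)  # 2 chunks fit, 6000 tokens used
```

In a real setup the base prompt, the question, and the generated answer also consume part of the window, so the usable budget for document text is smaller still.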
Of course this should get better as LLMs become more advanced, and I'd expect big improvements at the pace things are going. One option would be adding support for non-local LLMs, using the OpenAI API for example. This would massively increase the context length, but I'm not sure it makes much sense as there are already tools that do that.
That's what I thought might be the problem. Have you tried other models? Or maybe an API solution to an open source model? I'm running openhermes-2.5-mistral-7b.Q3_K_M locally via Koboldcpp and it's quite fast. But I guess there's added load with vision?