ParisNeo/lollms-webui

Does not respond with any text after sending the question. Version 0.0.5

ziyunxiao opened this issue · 4 comments

Expected Behavior

After sending the question, "tell me a joke", it should return some joke.

Current Behavior

It returns an empty response.

Steps to Reproduce

  1. Run webui.sh
    (Note: webui.sh has two issues. a) The line endings are Windows-style (CRLF) and need to be converted to Linux (LF). b) The model download folder is wrong and needs to be updated to ./models/llama_cpp/. See the commands after this list.)
  2. Open browser, http://localhost:9600/
  3. Type question "tell me a joke"
  4. click send
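
A rough sketch of the workarounds from step 1 (assuming a typical Linux shell; dos2unix is optional and sed works as a fallback):

    dos2unix webui.sh            # convert Windows (CRLF) line endings to Linux (LF)
    sed -i 's/\r$//' webui.sh    # fallback if dos2unix is not installed
    mkdir -p ./models/llama_cpp  # folder the app reads models from
    # then move or re-download the model file into ./models/llama_cpp/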


Screenshots

[2023-04-24 09:11:31,405] {_internal.py:224} INFO - 127.0.0.1 - - [24/Apr/2023 09:11:31] "POST /generate HTTP/1.1" 200 -
Generating 1024 outputs... 
Input text : GPT4All is a smart and helpful Assistant built by Nomic-AI. It can discuss with humans and assist them.
### Assistant:Welcome! I am GPT4All A free and open assistant. What can I do for you today?
### Human:tell me a joke
### Assistant:
llama_generate: seed = 1682349091

system_info: n_threads = 8 / 64 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
sampling: temp = 0.900000, top_k = 50, top_p = 0.950000, repeat_last_n = 40, repeat_penalty = 1.200000
generate: n_ctx = 512, n_batch = 8, n_predict = 1024, n_keep = 0


 [end of text]

llama_print_timings:        load time = 41434.44 ms
llama_print_timings:      sample time =    16.06 ms /    33 runs   (    0.49 ms per run)
llama_print_timings: prompt eval time =  4391.98 ms /    74 tokens (   59.35 ms per token)
llama_print_timings:        eval time =  6079.19 ms /    32 runs   (  189.97 ms per run)
llama_print_timings:       total time = 51473.49 ms
## Done ##
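
For reference, the /generate endpoint can also be hit directly from the command line to take the web UI out of the picture. This is only a sketch; the JSON field name ("prompt") is an assumption, not the confirmed request format:

    curl -s -X POST http://localhost:9600/generate \
         -H 'Content-Type: application/json' \
         -d '{"prompt": "tell me a joke"}'
    # the payload shape is a guess; the real one can be copied from the browser dev tools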


Thanks,

Robin

same here

Same.
It sees the requests, but there's no indication of whether the model is actually being used or not.

Well, clearly from the OP's post you can see that the POST request was sent and received:


 [end of text]

llama_print_timings:        load time = 41434.44 ms
llama_print_timings:      sample time =    16.06 ms /    33 runs   (    0.49 ms per run)
llama_print_timings: prompt eval time =  4391.98 ms /    74 tokens (   59.35 ms per token)
llama_print_timings:        eval time =  6079.19 ms /    32 runs   (  189.97 ms per run)
llama_print_timings:       total time = 51473.49 ms
## Done ##

... sometimes it works, sometimes it doesn't.
Try a different model.

I just want to add that this was happening to me, and apparently it was because I had no model loaded. I missed the "Apply changes" button on the settings page. I'm not sure why it was generating an empty response instead of showing the "no model has been loaded" notification.
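
Building on that, a quick sanity check is to confirm a model file is actually present in the folder mentioned earlier in the thread, and that it has been selected and applied in the settings. (The .bin extension below is an assumption for llama.cpp-era models.)

    ls -lh ./models/llama_cpp/   # should list at least one model file, e.g. *.bin
    # if it is empty or no model was selected, pick one on the settings page
    # and click "Apply changes" before sending a prompt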