Chat reply keeps generating after stop and shows the whole sample response
chongy076 opened this issue · 2 comments
Expected Behavior
Pressing stop should halt generation and remove the partial sample from the reply shown to the user.
Current Behavior
Generation continues even after stop is pressed.
Steps to Reproduce
Please provide detailed steps to reproduce the issue.
- Step 1 - load model ggml-vicuna-13b-4bit-rev1.bin
- Step 2 - start the chat
Possible Solution
The code committed about 8 hours ago was working fine and also generated faster; the latest code seems to have removed a lot of the heavy code.
It may be worth taking a look at from pyllamacpp.model import Model for generation. Previously self.cancel_gen=True was used to stop generation, but I noticed the backend keeps running even after cancel is called.
At least on the front end that stopped the hallucinated text from appearing for the user.
The best solution would be to stop inside model generation itself: it produces a large result that is not required and consumes a lot of time.
Unless generation is broken into batches, returning from the callback may not be able to stop the thread inside the model.
for tok in self.model.generate(prompt,
                               n_predict=n_predict,
                               temp=self.config['temp'],
                               top_k=self.config['top_k'],
                               top_p=self.config['top_p'],
                               repeat_penalty=self.config['repeat_penalty'],
                               repeat_last_n=self.config['repeat_last_n'],
                               n_threads=self.config['n_threads'],
                               ):
    if not new_text_callback(tok):
        return
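For illustration, here is a minimal sketch of why a front-end cancel flag stops the display but not the backend computation. The `CancellableStream` class and fake token source are hypothetical, not the project's actual code: the flag is only checked between tokens, so the work that produced the current token has already been spent by the time the cancel is noticed.

```python
# Hypothetical sketch of a cancel flag checked between streamed tokens.
# NOT the project's real code; it only illustrates why setting
# cancel_gen=True stops the front end but not the backend generation.

class CancellableStream:
    def __init__(self):
        self.cancel_gen = False  # set to True from another thread/request

    def run(self, tokens, new_text_callback):
        """Consume tokens until cancelled or the callback returns False."""
        produced = []
        for tok in tokens:
            if self.cancel_gen:
                # The cancel is only noticed *between* tokens; the backend
                # call that produced the next token cannot be interrupted.
                break
            if not new_text_callback(tok):
                break
            produced.append(tok)
        return produced
```

With a real model the loop body would be inside `model.generate(...)`, which is exactly why batching generation (so control returns between batches) is the only way for this kind of flag to take effect.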
Context
Please provide any additional context about the issue.
Screenshots
If applicable, add screenshots to help explain the issue.
The last version does stop every thing if you use the llamacpp backend. you have to update the project so that it installs the latest version of the backend requirements. pip install -r requirements.txt
Hi, I will try again. In the meantime I am looking at pyllamacpp; it has anti_prompts and some other parameters.
Could you check the pull request I made today against the code you updated for the WSGI server? To run smoothly, http_server can serve directly without needing socketio.run.
Thanks
[Here is the test code I tried with anti_prompts]
#from pyllamacpp.model import Model
#model = Model(ggml_model='./models/llama_cpp/ggml-vicuna-13b-4bit-rev1.bin')
#for token in model.generate("hello"):
#    print(token, end='')
from pyllamacpp.model import Model

prompt_context = """ Act as ### Assistant:. ### Assistant: is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. To do this, ### Assistant: uses a database of information collected from many different sources, including books, journals, online articles, and more. Stop generate after ### Assistant:
Human: Nice to meet you Bob!
Assistant: Welcome! I'm here to assist you with anything you need. What can I do for you today?
"""
prompt_prefix = "### Human:"
prompt_suffix = "### Assistant:"
smodel = './models/llama_cpp/ggml-vicuna-13b-4bit-rev1.bin'
model = Model(ggml_model=smodel, n_ctx=512, prompt_context=prompt_context,
              prompt_prefix=prompt_prefix, prompt_suffix=prompt_suffix,
              anti_prompts=[prompt_prefix])

while True:
    try:
        bStart = False
        prompt = input(prompt_prefix)
        if prompt == '':
            continue
        print(prompt_suffix, end='')
        for tok in model.generate(prompt, n_predict=300):
            if prompt_suffix in tok:
                bStart = True
                print("[Start]")
            if bStart == True and prompt_prefix in tok:
                print("[terminate]")
                break
            print(f"{tok}", end='', flush=True)
        print()
    except KeyboardInterrupt:
        break
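The start/stop marker logic in that loop can also be pulled into a small helper and exercised with a fake token stream, no model required. This is a hypothetical function, just a sketch of the same anti-prompt check, not part of pyllamacpp:

```python
def stream_until_anti_prompt(tokens, start_marker, stop_marker):
    """Collect streamed tokens; once start_marker has been seen,
    stop as soon as a token contains stop_marker (the anti-prompt)."""
    started = False
    out = []
    for tok in tokens:
        if start_marker in tok:
            started = True
        if started and stop_marker in tok:
            break  # anti-prompt reached: the model is starting a new turn
        out.append(tok)
    return "".join(out)
```

Factoring it out this way makes it easy to verify the cut-off behaviour with unit tests before wiring it to the real generator.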
[Result from the model]
Human:list 2 pain killer medicine
Assistant: Here are two common painkiller medicines:
- Aspirin - a nonsteroidal anti-inflammatory drug (NSAID) used to relieve minor aches and pains, such as headaches, menstrual cramps, arthritis, toothaches, and the common cold.
- Acetaminophen (also known as paracetamol) - a pharmacological agent used for the relief of fe
Human:
hahaha, this is called a box ...... :D