generate <unk><unk><unk><unk><unk><unk><unk><unk>

Question

louwangzhiyuY opened this issue 9 months ago · 4 comments

python console_chat.py
Loading checkpoint shards: 100%|████████████████████████████████████████| 2/2 [02:37<00:00, 78.94s/it]
Number of GPUs available: 1
Model ../model-cache/mistralai/Mistral-7B loaded successfully on cuda
Enter your text (type #end to stop): What is captial of canada?
~~### Text: What is captial of canada?~~

~~The tone is:~~

Answer 1 · 2023-12-21T23:25:58.000Z

When I use the console chat, the answer is completely different from the one the gradio chat gives:

Number of GPUs available: 1
Model ../model-cache/mistralai/Mistral-7B loaded successfully on cuda
Enter your text (type #end to stop): What is capital of canada?
<s> ### Text: What is capital of canada?
### The tone is:
surprise </s>
Enter your text (type #end to stop):

and in browser:

Both load the model from ../model-cache/mistralai/Mistral-7B. Why are the answers different?

Answer 2 · 2024-01-03T21:24:15.000Z

Hi @elsaco, I cannot reproduce your results; I get the same inference results from both console and gradio.

It seems to me that your gradio demo is running only the base model.
Could you check whether the adapter exists after fine-tuning and is loaded correctly in lines 22-41 of gradio_chat.py?
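
As a quick way to run the first half of that check, here is a minimal sketch that looks for the files PEFT conventionally writes when an adapter is saved (`adapter_config.json` plus `adapter_model.safetensors`, or `adapter_model.bin` with older versions). It assumes the fine-tuning step saved a PEFT-style adapter; the helper name `check_adapter_dir` and the default path `./adapter-output` are hypothetical and not taken from this repo. It only confirms that the files are present, not that gradio_chat.py actually loads them.

```python
from pathlib import Path

# File names PEFT conventionally uses for a saved adapter;
# older versions write adapter_model.bin instead of .safetensors.
CONFIG_NAME = "adapter_config.json"
WEIGHT_NAMES = ("adapter_model.safetensors", "adapter_model.bin")


def check_adapter_dir(adapter_dir):
    """Return a list of problems found in adapter_dir (empty list: looks fine)."""
    root = Path(adapter_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    if not (root / CONFIG_NAME).is_file():
        problems.append(f"missing {CONFIG_NAME}")

    # Accept either weight file name; report if neither is present.
    weights = [name for name in WEIGHT_NAMES if (root / name).is_file()]
    if not weights:
        problems.append("missing adapter weights (" + " or ".join(WEIGHT_NAMES) + ")")
    elif (root / weights[0]).stat().st_size == 0:
        problems.append(f"{weights[0]} is empty")
    return problems


if __name__ == "__main__":
    import sys

    # Hypothetical default path; pass the real fine-tuning output directory instead.
    target = sys.argv[1] if len(sys.argv) > 1 else "./adapter-output"
    issues = check_adapter_dir(target)
    print("adapter files look OK" if not issues else "\n".join(issues))
```

If this reports no problems, the next thing to look at is whether the path passed to the adapter-loading code in gradio_chat.py points at the same directory.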

Answer 3 · 2024-01-04T20:50:38.000Z

The adapter exists, but all that is returned is surprise

Output after mistral-7b finetuning:

The output is the same with the phi-2 model, so it might be a gradio chat issue.

Answer 4 · 2024-01-04T20:56:05.000Z

Ok, so the E2E fine-tuning and inferencing workflow should work in your setup.

The inference result may be random because the adapter is trained on a small toy dataset intended for demonstration purposes only.