huggingface/tgi-gaudi

Integrate critical PR from TGI upstream

luoyiroy opened this issue · 4 comments

Feature request

Running inference on Llama 3 with tgi-gaudi emits special tokens used by Meta Llama 3 (such as '<|eot_id|>') in the generated output.
For example, here is sample output from a curl test:

data:{"id":"","object":"text_completion","created":1717513572,"model":"/mnt/models","system_fingerprint":"2.0.0-native","choices":[{"index":0,"delta":{"role":"assistant","content":"<|eot_id|>"},"logprobs":null,"finish_reason":"eos_token"}]}

Could you please rebase onto upstream and merge the critical PR huggingface#1808, which adds handling for special tokens such as '<|eot_id|>' so that Llama 3 is supported?
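
Until the upstream fix is available, a client could filter the token itself. The snippet below is only a minimal sketch of that idea, not part of tgi-gaudi or the upstream PR: the function name `extract_content` and the `SPECIAL_TOKENS` tuple are hypothetical, and it assumes each streamed line has the same `data:{...}` shape as the sample above.

```python
import json

# Assumption: the special tokens to strip; extend this tuple as needed.
SPECIAL_TOKENS = ("<|eot_id|>",)


def extract_content(sse_line: str) -> str:
    """Return the delta content of one streamed `data:` line, minus special tokens."""
    if not sse_line.startswith("data:"):
        return ""
    payload = sse_line[len("data:"):].strip()
    # Some streaming servers end with a "[DONE]" sentinel; skip it if present.
    if not payload or payload == "[DONE]":
        return ""
    chunk = json.loads(payload)
    content = chunk["choices"][0].get("delta", {}).get("content") or ""
    for token in SPECIAL_TOKENS:
        content = content.replace(token, "")
    return content
```

Applied to the sample line above, this returns an empty string, so the token never reaches the user. It is a workaround on the client side only; the server still generates the token.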

Motivation

NONE

Your contribution

You can review the upstream PR: huggingface#1808

Thanks @luoyiroy, we are planning to rebase onto TGI 2.0.2 soon (the PR you mentioned is part of TGI 2.0.2).
I will start the rebase after we merge TGI 2.0.1: #154

Hi @kdamaszk, I noticed TGI 2.0.1 has been merged. Do you have an ETA for TGI 2.0.2?
Thanks.
-Yang

Please check #158

PR #158 has been merged into the habana-main branch. Closing this issue.