BerriAI/liteLLM-proxy

Issue with docker image and calls

Closed this issue · 6 comments

Hi

I have been using the following Docker Compose file for startup without issue, but I have just started getting problems:

services:

  tgi:
    image: ghcr.io/huggingface/text-generation-inference:1.2
    command: --model-id TheBloke/zephyr-7B-beta-AWQ --max-batch-prefill-tokens 2048 --quantize awq
    volumes:
      - ./models:/data
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  llm-api:
    image: ghcr.io/berriai/litellm:main-v1.10.3
    
    command:
      - /bin/sh
      - -c
      - |
        pip install async_generator
        litellm --model huggingface/TheBloke/zephyr-7B-beta-AWQ --api_base http://tgi/generate_stream --host 0.0.0.0 --port 3000
    entrypoint: []
    ports:
      - 7766:3000    
    platform: linux/amd64

I make the following curl call

curl http://localhost:7766/v1/chat/completions    -H "Content-Type: application/json"   -d '{  "model": "zephyr-7B-beta-AWQ", "messages": [{"role": "user", "content": "what is the capital of england"}]}'

and get the following. The question does get answered at the end of the response, but the proxy also returns an error: "detail":"HuggingfaceException - Expecting value: line 1 column 1 (char 0)"
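For context, here is a minimal sketch of why the decoder stops at character 0. It assumes what the traceback in the response shows: the body coming back from TGI is a series of `data:{...}` event lines (the `--api_base` in the compose file ends in `/generate_stream`), and litellm calls `response.json()` on that whole body. The two sample lines are copied from the "Original Response" text; `parse_sse_tokens` is a hypothetical helper written only for this illustration and is not part of litellm.

```python
import json

# Two event lines copied from the "Original Response" in the error output.
sse_body = (
    'data:{"token":{"id":5565,"text":" capital","logprob":-0.023544312,"special":false},'
    '"generated_text":null,"details":null}\n\n'
    'data:{"token":{"id":302,"text":" of","logprob":-0.3972168,"special":false},'
    '"generated_text":null,"details":null}\n\n'
)


def parse_sse_tokens(body: str) -> list:
    """Hypothetical helper: collect the token text from each 'data:' line."""
    tokens = []
    for line in body.splitlines():
        if line.startswith("data:"):
            # Drop the "data:" prefix; the remainder of the line is plain JSON.
            event = json.loads(line[len("data:"):])
            tokens.append(event["token"]["text"])
    return tokens


# Parsing the whole body as a single JSON document fails at the first
# character, because the body starts with "data:" rather than a JSON value.
try:
    json.loads(sse_body)
    whole_body_error = None
except json.JSONDecodeError as exc:
    whole_body_error = str(exc)

tokens = parse_sse_tokens(sse_body)
print(whole_body_error)
print(tokens)
```

Each event line parses once the `data:` prefix is stripped, so the generated text is intact and only the single-document JSON parse fails. The full response from the proxy was: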

{"detail":"HuggingfaceException - Expecting value: line 1 column 1 (char 0)\n\nOriginal Response: data:{\"token\":{\"id\":28789,\"text\":\"<\",\"logprob\":-0.0015649796,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-0.0000034570694,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":489,\"text\":\"ass\",\"logprob\":-0.0000010728836,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11143,\"text\":\"istant\",\"logprob\":-0.0000023841858,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-8.34465e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28767,\"text\":\">\",\"logprob\":-0.0000011920929,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":13,\"text\":\"\\n\",\"logprob\":-0.00062561035,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1014,\"text\":\"The\",\"logprob\":-0.24304199,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.023544312,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":302,\"text\":\" of\",\"logprob\":-0.3972168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.0012617111,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.0013980865,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":4222,\"text\":\" 
London\",\"logprob\":-0.019927979,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.19641113,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2993,\"text\":\" However\",\"logprob\":-0.26660156,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.000010609627,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.46411133,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.1472168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-1.1308594,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.054351807,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.0000051259995,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2939,\"text\":\" country\",\"logprob\":-0.0051498413,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2373,\"text\":\" within\",\"logprob\":-0.51416016,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" the\",\"logprob\":-0.000021338463,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-0.017425537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" 
Kingdom\",\"logprob\":-0.000008225441,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.024337769,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.37060547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" the\",\"logprob\":-0.40722656,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-1.0527344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" Kingdom\",\"logprob\":-0.000028848648,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.0693359,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.00016498566,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2894,\"text\":\" whole\",\"logprob\":-0.000089645386,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1235,\"text\":\" does\",\"logprob\":-0.38671875,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":459,\"text\":\" not\",\"logprob\":-0.00006234646,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":506,\"text\":\" have\",\"logprob\":-0.0008454323,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.0059127808,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.03540039,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" 
city\",\"logprob\":-0.23498535,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.1435547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1259,\"text\":\" such\",\"logprob\":-0.14331055,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.3293457,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":415,\"text\":\" The\",\"logprob\":-0.6118164,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":6194,\"text\":\" UK\",\"logprob\":-0.96777344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28742,\"text\":\"'\",\"logprob\":-0.39111328,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28713,\"text\":\"s\",\"logprob\":-0.0000022649765,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":799,\"text\":\" other\",\"logprob\":-0.39526367,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.02331543,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.000024080276,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5780,\"text\":\" countries\",\"logprob\":-0.00116539,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":460,\"text\":\" are\",\"logprob\":-0.3400879,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14322,\"text\":\" 
Scotland\",\"logprob\":-0.033111572,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0056495667,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14831,\"text\":\" Wales\",\"logprob\":-0.0023822784,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0059928894,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.00001168251,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":12781,\"text\":\" Northern\",\"logprob\":-0.00004386902,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11170,\"text\":\" Ireland\",\"logprob\":-0.0000026226044,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.14685059,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1430,\"text\":\" each\",\"logprob\":-0.26757812,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":395,\"text\":\" with\",\"logprob\":-0.05065918,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":871,\"text\":\" its\",\"logprob\":-0.4020996,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1216,\"text\":\" own\",\"logprob\":-0.007835388,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.027450562,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" 
city\",\"logprob\":-0.07928467,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28747,\"text\":\":\",\"logprob\":-0.6953125,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":25970,\"text\":\" Edinburgh\",\"logprob\":-0.00020945072,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.011047363,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8564,\"text\":\" Card\",\"logprob\":-0.00029182434,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2728,\"text\":\"iff\",\"logprob\":-0.00000333786,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.00059080124,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.0000019073486,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":365,\"text\":\" B\",\"logprob\":-0.000027298927,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":599,\"text\":\"elf\",\"logprob\":-2.3841858e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":529,\"text\":\"ast\",\"logprob\":-0.000002026558,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.04925537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8628,\"text\":\" 
respectively\",\"logprob\":-0.000021219254,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.0000834465,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2,\"text\":\"</s>\",\"logprob\":-0.18835449,\"special\":true},\"generated_text\":\"<|assistant|>\\nThe capital of England is London. However, England is a constituent country within the United Kingdom, and the United Kingdom as a whole does not have a capital city as such. The UK's other constituent countries are Scotland, Wales, and Northern Ireland, each with its own capital city: Edinburgh, Cardiff, and Belfast, respectively.\",\"details\":{\"finish_reason\":\"eos_token\",\"generated_tokens\":80,\"seed\":null}}\n\n\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.9/site-packages/litellm/llms/huggingface_restapi.py\", line 456, in acompletion\n    response_json = response.json()\n  File \"/usr/local/lib/python3.9/site-packages/httpx/_models.py\", line 761, in json\n    return jsonlib.loads(self.content, **kwargs)\n  File \"/usr/local/lib/python3.9/json/__init__.py\", line 346, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/local/lib/python3.9/json/decoder.py\", line 337, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/local/lib/python3.9/json/decoder.py\", line 355, in raw_decode\n    raise JSONDecodeError(\"Expecting value\", s, err.value) from None\njson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.9/site-packages/litellm/main.py\", line 185, in acompletion\n    response = await init_response\n  File \"/usr/local/lib/python3.9/site-packages/litellm/llms/huggingface_restapi.py\", line 472, in acompletion\n    raise HuggingfaceError(status_code=500, 
message=f\"{str(e)}\\n\\nOriginal Response: {response.text}\")\nlitellm.llms.huggingface_restapi.HuggingfaceError: Expecting value: line 1 column 1 (char 0)\n\nOriginal Response: data:{\"token\":{\"id\":28789,\"text\":\"<\",\"logprob\":-0.0015649796,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-0.0000034570694,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":489,\"text\":\"ass\",\"logprob\":-0.0000010728836,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11143,\"text\":\"istant\",\"logprob\":-0.0000023841858,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-8.34465e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28767,\"text\":\">\",\"logprob\":-0.0000011920929,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":13,\"text\":\"\\n\",\"logprob\":-0.00062561035,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1014,\"text\":\"The\",\"logprob\":-0.24304199,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.023544312,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":302,\"text\":\" of\",\"logprob\":-0.3972168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.0012617111,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.0013980865,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":4222,\"text\":\" 
London\",\"logprob\":-0.019927979,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.19641113,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2993,\"text\":\" However\",\"logprob\":-0.26660156,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.000010609627,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.46411133,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.1472168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-1.1308594,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.054351807,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.0000051259995,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2939,\"text\":\" country\",\"logprob\":-0.0051498413,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2373,\"text\":\" within\",\"logprob\":-0.51416016,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" the\",\"logprob\":-0.000021338463,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-0.017425537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" 
Kingdom\",\"logprob\":-0.000008225441,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.024337769,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.37060547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" the\",\"logprob\":-0.40722656,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-1.0527344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" Kingdom\",\"logprob\":-0.000028848648,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.0693359,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.00016498566,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2894,\"text\":\" whole\",\"logprob\":-0.000089645386,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1235,\"text\":\" does\",\"logprob\":-0.38671875,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":459,\"text\":\" not\",\"logprob\":-0.00006234646,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":506,\"text\":\" have\",\"logprob\":-0.0008454323,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.0059127808,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.03540039,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" 
city\",\"logprob\":-0.23498535,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.1435547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1259,\"text\":\" such\",\"logprob\":-0.14331055,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.3293457,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":415,\"text\":\" The\",\"logprob\":-0.6118164,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":6194,\"text\":\" UK\",\"logprob\":-0.96777344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28742,\"text\":\"'\",\"logprob\":-0.39111328,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28713,\"text\":\"s\",\"logprob\":-0.0000022649765,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":799,\"text\":\" other\",\"logprob\":-0.39526367,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.02331543,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.000024080276,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5780,\"text\":\" countries\",\"logprob\":-0.00116539,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":460,\"text\":\" are\",\"logprob\":-0.3400879,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14322,\"text\":\" 
Scotland\",\"logprob\":-0.033111572,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0056495667,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14831,\"text\":\" Wales\",\"logprob\":-0.0023822784,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0059928894,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.00001168251,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":12781,\"text\":\" Northern\",\"logprob\":-0.00004386902,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11170,\"text\":\" Ireland\",\"logprob\":-0.0000026226044,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.14685059,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1430,\"text\":\" each\",\"logprob\":-0.26757812,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":395,\"text\":\" with\",\"logprob\":-0.05065918,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":871,\"text\":\" its\",\"logprob\":-0.4020996,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1216,\"text\":\" own\",\"logprob\":-0.007835388,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.027450562,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" 
city\",\"logprob\":-0.07928467,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28747,\"text\":\":\",\"logprob\":-0.6953125,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":25970,\"text\":\" Edinburgh\",\"logprob\":-0.00020945072,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.011047363,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8564,\"text\":\" Card\",\"logprob\":-0.00029182434,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2728,\"text\":\"iff\",\"logprob\":-0.00000333786,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.00059080124,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.0000019073486,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":365,\"text\":\" B\",\"logprob\":-0.000027298927,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":599,\"text\":\"elf\",\"logprob\":-2.3841858e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":529,\"text\":\"ast\",\"logprob\":-0.000002026558,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.04925537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8628,\"text\":\" 
respectively\",\"logprob\":-0.000021219254,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.0000834465,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2,\"text\":\"</s>\",\"logprob\":-0.18835449,\"special\":true},\"generated_text\":\"<|assistant|>\\nThe capital of England is London. However, England is a constituent country within the United Kingdom, and the United Kingdom as a whole does not have a capital city as such. The UK's other constituent countries are Scotland, Wales, and Northern Ireland, each with its own capital city: Edinburgh, Cardiff, and Belfast, respectively.\",\"details\":{\"finish_reason\":\"eos_token\",\"generated_tokens\":80,\"seed\":null}}\n\n\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.9/site-packages/litellm/proxy/proxy_server.py\", line 873, in chat_completion\n    response = await litellm.acompletion(**data)\n  File \"/usr/local/lib/python3.9/site-packages/litellm/utils.py\", line 1465, in wrapper_async\n    raise e\n  File \"/usr/local/lib/python3.9/site-packages/litellm/utils.py\", line 1411, in wrapper_async\n    result = await original_function(*args, **kwargs)\n  File \"/usr/local/lib/python3.9/site-packages/litellm/main.py\", line 195, in acompletion\n    raise exception_type(\n  File \"/usr/local/lib/python3.9/site-packages/litellm/utils.py\", line 4634, in exception_type\n    raise e\n  File \"/usr/local/lib/python3.9/site-packages/litellm/utils.py\", line 4199, in exception_type\n    raise APIError(\nlitellm.exceptions.APIError: HuggingfaceException - Expecting value: line 1 column 1 (char 0)\n\nOriginal Response: 
data:{\"token\":{\"id\":28789,\"text\":\"<\",\"logprob\":-0.0015649796,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-0.0000034570694,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":489,\"text\":\"ass\",\"logprob\":-0.0000010728836,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11143,\"text\":\"istant\",\"logprob\":-0.0000023841858,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28766,\"text\":\"|\",\"logprob\":-8.34465e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28767,\"text\":\">\",\"logprob\":-0.0000011920929,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":13,\"text\":\"\\n\",\"logprob\":-0.00062561035,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1014,\"text\":\"The\",\"logprob\":-0.24304199,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.023544312,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":302,\"text\":\" of\",\"logprob\":-0.3972168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.0012617111,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.0013980865,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":4222,\"text\":\" London\",\"logprob\":-0.019927979,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.19641113,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2993,\"text\":\" 
However\",\"logprob\":-0.26660156,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.000010609627,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5783,\"text\":\" England\",\"logprob\":-0.46411133,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":349,\"text\":\" is\",\"logprob\":-0.1472168,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-1.1308594,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.054351807,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.0000051259995,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2939,\"text\":\" country\",\"logprob\":-0.0051498413,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2373,\"text\":\" within\",\"logprob\":-0.51416016,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" the\",\"logprob\":-0.000021338463,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-0.017425537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" Kingdom\",\"logprob\":-0.000008225441,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.024337769,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.37060547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":272,\"text\":\" 
the\",\"logprob\":-0.40722656,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2969,\"text\":\" United\",\"logprob\":-1.0527344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11508,\"text\":\" Kingdom\",\"logprob\":-0.000028848648,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.0693359,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.00016498566,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2894,\"text\":\" whole\",\"logprob\":-0.000089645386,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1235,\"text\":\" does\",\"logprob\":-0.38671875,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":459,\"text\":\" not\",\"logprob\":-0.00006234646,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":506,\"text\":\" have\",\"logprob\":-0.0008454323,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":264,\"text\":\" a\",\"logprob\":-0.0059127808,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.03540039,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" city\",\"logprob\":-0.23498535,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":390,\"text\":\" as\",\"logprob\":-1.1435547,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1259,\"text\":\" 
such\",\"logprob\":-0.14331055,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.3293457,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":415,\"text\":\" The\",\"logprob\":-0.6118164,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":6194,\"text\":\" UK\",\"logprob\":-0.96777344,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28742,\"text\":\"'\",\"logprob\":-0.39111328,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28713,\"text\":\"s\",\"logprob\":-0.0000022649765,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":799,\"text\":\" other\",\"logprob\":-0.39526367,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14769,\"text\":\" constitu\",\"logprob\":-0.02331543,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":308,\"text\":\"ent\",\"logprob\":-0.000024080276,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5780,\"text\":\" countries\",\"logprob\":-0.00116539,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":460,\"text\":\" are\",\"logprob\":-0.3400879,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14322,\"text\":\" Scotland\",\"logprob\":-0.033111572,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0056495667,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":14831,\"text\":\" 
Wales\",\"logprob\":-0.0023822784,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.0059928894,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.00001168251,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":12781,\"text\":\" Northern\",\"logprob\":-0.00004386902,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":11170,\"text\":\" Ireland\",\"logprob\":-0.0000026226044,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.14685059,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1430,\"text\":\" each\",\"logprob\":-0.26757812,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":395,\"text\":\" with\",\"logprob\":-0.05065918,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":871,\"text\":\" its\",\"logprob\":-0.4020996,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":1216,\"text\":\" own\",\"logprob\":-0.007835388,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":5565,\"text\":\" capital\",\"logprob\":-0.027450562,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2990,\"text\":\" city\",\"logprob\":-0.07928467,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28747,\"text\":\":\",\"logprob\":-0.6953125,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":25970,\"text\":\" 
Edinburgh\",\"logprob\":-0.00020945072,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.011047363,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8564,\"text\":\" Card\",\"logprob\":-0.00029182434,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2728,\"text\":\"iff\",\"logprob\":-0.00000333786,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.00059080124,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":304,\"text\":\" and\",\"logprob\":-0.0000019073486,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":365,\"text\":\" B\",\"logprob\":-0.000027298927,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":599,\"text\":\"elf\",\"logprob\":-2.3841858e-7,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":529,\"text\":\"ast\",\"logprob\":-0.000002026558,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28725,\"text\":\",\",\"logprob\":-0.04925537,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":8628,\"text\":\" respectively\",\"logprob\":-0.000021219254,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":28723,\"text\":\".\",\"logprob\":-0.0000834465,\"special\":false},\"generated_text\":null,\"details\":null}\n\ndata:{\"token\":{\"id\":2,\"text\":\"</s>\",\"logprob\":-0.18835449,\"special\":true},\"generated_text\":\"<|assistant|>\\nThe capital of England is London. However, England is a constituent country within the United Kingdom, and the United Kingdom as a whole does not have a capital city as such. 
The UK's other constituent countries are Scotland, Wales, and Northern Ireland, each with its own capital city: Edinburgh, Cardiff, and Belfast, respectively.\",\"details\":{\"finish_reason\":\"eos_token\",\"generated_tokens\":80,\"seed\":null}}\n\n\n"}

I have tried different models and still get the same error.
I appreciate this is not the latest docker image, but it has been working for us, which is why we kept it pinned for stability.

Hey @kulbinderdio, are you saying you started seeing issues with ghcr.io/berriai/litellm:main-v1.10.3, or only after you upgraded to ghcr.io/berriai/litellm:main-v1.17.0?

The issue indicates an error in the Hugging Face response parsing, but we haven't made any changes to it.

@krrishdholakia
I really don't understand it either, as I have been using the same file for ages.
I did a complete clean-up of docker images recently, but as this image is tagged I wouldn't have expected any changes.
I do notice the following during startup; I don't know if it is related:

bionic-llm-api-1  | Requirement already satisfied: async_generator in /usr/local/lib/python3.9/site-packages (1.10)
bionic-tgi-1      | 2024-01-10T17:01:40.736545Z  INFO text_generation_launcher: Args { model_id: "TheBloke/zephyr-7B-beta-AWQ", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: Some(Awq), dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 2048, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "c6c7fc372236", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
bionic-tgi-1      | 2024-01-10T17:01:40.736608Z  INFO download: text_generation_launcher: Starting download process.
bionic-llm-api-1  | WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
bionic-llm-api-1  | 
bionic-llm-api-1  | [notice] A new release of pip is available: 23.0.1 -> 23.3.2
bionic-llm-api-1  | [notice] To update, run: pip install --upgrade pip
bionic-llm-api-1  | /usr/local/lib/python3.9/site-packages/pydantic/_internal/_fields.py:149: UserWarning: Field "model_name" has conflict with protected namespace "model_".
bionic-llm-api-1  | 
bionic-llm-api-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
bionic-llm-api-1  |   warnings.warn(
bionic-llm-api-1  | /usr/local/lib/python3.9/site-packages/pydantic/_internal/_fields.py:149: UserWarning: Field "model_info" has conflict with protected namespace "model_".
bionic-llm-api-1  | 
bionic-llm-api-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
bionic-llm-api-1  |   warnings.warn(
bionic-llm-api-1  | INFO:     Started server process [10]
bionic-llm-api-1  | INFO:     Waiting for application startup.
bionic-llm-api-1  | INFO:     Application startup complete.

From the logs:

bionic-llm-api-1  | data:{"token":{"id":2,"text":"</s>","logprob":-1.0351562,"special":true},"generated_text":"<|assistant|>\nI am not a physical entity and do not have a language model (LM) associated with me. I am a virtual assistant powered by a pre-trained transformer-based LM, which allows me to understand and generate human-like responses to your queries. The specific LM used to train me is called GPT-3 (Generative Pre-trained Transformer 3), which is one of the most advanced and powerful LMs available today.","details":{"finish_reason":"eos_token","generated_tokens":100,"seed":null}}
bionic-llm-api-1  | 
bionic-llm-api-1  | 
bionic-llm-api-1  | 
bionic-llm-api-1  | During handling of the above exception, another exception occurred:
bionic-llm-api-1  | 
bionic-llm-api-1  | Traceback (most recent call last):
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/proxy/proxy_server.py", line 1464, in chat_completion
bionic-llm-api-1  |     response = await litellm.acompletion(**data)
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/utils.py", line 2366, in wrapper_async
bionic-llm-api-1  |     raise e
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/utils.py", line 2258, in wrapper_async
bionic-llm-api-1  |     result = await original_function(*args, **kwargs)
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/main.py", line 227, in acompletion
bionic-llm-api-1  |     raise exception_type(
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/utils.py", line 6628, in exception_type
bionic-llm-api-1  |     raise e
bionic-llm-api-1  |   File "/usr/local/lib/python3.9/site-packages/litellm/utils.py", line 6111, in exception_type
bionic-llm-api-1  |     raise APIError(
bionic-llm-api-1  | litellm.exceptions.APIError: HuggingfaceException - Expecting value: line 1 column 1 (char 0)

I don't know if this adds anything extra:

bionic-tgi-1      | 2024-01-11T11:27:50.154869Z  INFO generate_stream{parameters=GenerateParameters { best_of: None, temperature: None, repetition_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: false, max_new_tokens: None, return_full_text: Some(false), stop: [], truncate: None, watermark: false, details: true, decoder_input_details: false, seed: None, top_n_tokens: None } total_time="1.739647484s" validation_time="237.677µs" queue_time="24.533µs" inference_time="1.73938538s" time_per_token="17.393853ms" seed="None"}: text_generation_router::server: router/src/server.rs:457: Success
bionic-llm-api-1  | receiving data: {'model': 'zephyr-7B-beta-AWQ', 'messages': [{'role': 'user', 'content': 'what llm are you'}]}
bionic-llm-api-1  | litellm.cache: None
bionic-llm-api-1  | kwargs[caching]: False; litellm.cache: None
bionic-llm-api-1  | litellm.caching: False; litellm.caching_with_models: False; litellm.cache: None
bionic-llm-api-1  | kwargs[caching]: False; litellm.cache: None
bionic-llm-api-1  | 
bionic-llm-api-1  | LiteLLM completion() model= TheBloke/zephyr-7B-beta-AWQ; provider = huggingface
bionic-llm-api-1  | 
bionic-llm-api-1  | LiteLLM: Params passed to completion() {'functions': [], 'function_call': '', 'temperature': None, 'top_p': None, 'stream': None, 'max_tokens': None, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': None, 'user': '', 'response_format': None, 'seed': None, 'tools': None, 'tool_choice': None, 'max_retries': None, 'custom_llm_provider': 'huggingface', 'model': 'TheBloke/zephyr-7B-beta-AWQ', 'n': None, 'stop': None}
bionic-llm-api-1  | 
bionic-llm-api-1  | LiteLLM: Non-Default params passed to completion() {}
bionic-llm-api-1  | self.optional_params: {}
bionic-llm-api-1  | TheBloke/zephyr-7B-beta-AWQ, text-generation-inference
bionic-llm-api-1  | PRE-API-CALL ADDITIONAL ARGS: {'complete_input_dict': {'inputs': '\n\n<|user|>\nwhat llm are you</s>\n\n\n', 'parameters': {'details': True, 'return_full_text': False}, 'stream': False}, 'task': 'text-generation-inference', 'headers': {'content-type': 'application/json'}, 'api_base': 'http://tgi/generate_stream', 'acompletion': True}
bionic-llm-api-1  | 
bionic-llm-api-1  | 
bionic-llm-api-1  | POST Request Sent from LiteLLM:
bionic-llm-api-1  | curl -X POST \
bionic-llm-api-1  | http://tgi/generate_stream \
bionic-llm-api-1  | -H 'content-type: application/json' \
bionic-llm-api-1  | -d '{'inputs': '\n\n<|user|>\nwhat llm are you</s>\n\n\n', 'parameters': {'details': True, 'return_full_text': False}, 'stream': False}'
bionic-llm-api-1  | 
bionic-llm-api-1  | 
bionic-llm-api-1  | Logging Details: logger_fn - None | callable(logger_fn) - False
bionic-llm-api-1  | Logging Details LiteLLM-Failure Call
bionic-llm-api-1  | An error occurred: HuggingfaceException - Expecting value: line 1 column 1 (char 0)

@kulbinderdio if you're using a tagged docker image, then it's not our code that changed, as it builds from source.

Seeing your call again - it looks like you're pointing the proxy to the /generate_stream endpoint but making a non-streaming call.

Can you point to http://tgi/ and see if that fixes things?

Or try making a streaming call and check if that works.
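
For anyone landing here with the same error, a minimal Python sketch of why the mismatch could produce this exact message. It assumes (this is an illustration, not LiteLLM's actual code) that a non-streaming client parses the whole response body as a single JSON document. The helper names `parse_as_plain_json` and `parse_sse_events` are hypothetical, and the sample body is a shortened version of the `data:` lines quoted above.

```python
import json

# Shortened body in the shape of the TGI /generate_stream output quoted in
# this thread: one "data:{...}" line per token, separated by blank lines.
sse_body = (
    'data:{"token":{"id":1014,"text":"The","logprob":-0.24,"special":false},'
    '"generated_text":null,"details":null}\n\n'
    'data:{"token":{"id":2,"text":"</s>","logprob":-0.18,"special":true},'
    '"generated_text":"The capital of England is London.",'
    '"details":{"finish_reason":"eos_token","generated_tokens":2,"seed":null}}\n\n'
)


def parse_as_plain_json(body):
    """Hypothetical non-streaming path: treat the whole body as one JSON value."""
    return json.loads(body)


def parse_sse_events(body):
    """Hypothetical streaming path: decode each 'data:' line separately."""
    events = []
    for line in body.splitlines():
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):]))
    return events


# The body starts with the literal text "data:", which is not valid JSON,
# so the decoder fails on the very first character.
try:
    parse_as_plain_json(sse_body)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

# Decoding line by line works; the last event carries the full generated text.
events = parse_sse_events(sse_body)
final_text = events[-1]["generated_text"]

print(error_message)
print(final_text)
```

This would also explain why the answer is visible inside the error payload: the model did generate it, but the body arrived in a format the non-streaming path could not decode.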

@krrishdholakia thanks for this. I had completely missed this. Thanks a lot for your help
All working