xusenlinzy/api-for-open-llm

【embedding】Does this not support the latest SOTA embedding models? KeyError: 'Could not automatically map text2vec-base-multilingual to a tokeniser.'

Closed this issue · 2 comments

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues / discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

# Paste the runtime code here (delete this code block if you don't have it)

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

INFO:     119.3.147.8:46812 - "POST /v1/embeddings HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/data/luocheng/graph/api-for-open-llm/api/routes/embedding.py", line 42, in create_embeddings
    decoding = tiktoken.model.encoding_for_model(request.model)
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/tiktoken/model.py", line 103, in encoding_for_model
    return get_encoding(encoding_name_for_model(model_name))
  File "/data/luocheng/anaconda3/envs/openai/lib/python3.10/site-packages/tiktoken/model.py", line 90, in encoding_name_for_model
    raise KeyError(
KeyError: 'Could not automatically map text2vec-base-multilingual to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'

None of the models listed in the README work either, and the example embedding model raises the same error. Does this project not support embeddings?

Changing the code from

OpenAIEmbeddings(openai_api_base="http://localhost:8000/v1", openai_api_key="bge-base-zh", model="text2vec-base-multilingual")

to

OpenAIEmbeddings(openai_api_base="http://localhost:8000/v1", openai_api_key="bge-base-zh", model="text-embedding-ada-002")

solved the problem.