dottxt-ai/outlines

Cannot run outlines with Ollama via OpenAI-compatible server

Closed this issue · 7 comments

Describe the issue as clearly as possible:

The OpenAI client errors out when trying to set a tokenizer.

Steps/code to reproduce the bug:

# Ollama must be running locally (it serves an OpenAI-compatible API at http://localhost:11434/v1)
 
import tiktoken
from outlines import generate, models
 
model = models.openai(
    "phi3.5",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    tokenizer=tiktoken.get_encoding("cl100k_base"),
)
 
generator = generate.text(model)
 
result = generator("Question: What's 2+2? Answer:", max_tokens=100)
print(result)

Expected result:

LLM generation

Error message:

TypeError: AsyncOpenAI.__init__() got an unexpected keyword argument 'tokenizer'
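
The traceback suggests that models.openai forwards unrecognized keyword arguments straight to the underlying openai client, which rejects them. A minimal sketch of the same failure against the openai package directly (this is an assumption about the forwarding, not the actual outlines source):

import tiktoken
from openai import AsyncOpenAI

# passing an unknown kwarg to the client constructor raises the same TypeError
client = AsyncOpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    tokenizer=tiktoken.get_encoding("cl100k_base"),
)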

Outlines/Python version information:

Version information

(.env) [duarteocarmo:~/Repos/tricount]$ python -c "from outlines import _version; print(_version.version)" (main✱)
0.0.46
(.env) [duarteocarmo:~/Repos/tricount]$ python -c "import sys; print('Python', sys.version)" (main✱)
Python 3.12.4 (main, Jun 14 2024, 09:57:09) [Clang 15.0.0 (clang-1500.3.9.4)]
(.env) [duarteocarmo:~/Repos/tricount]$ pip freeze (main✱)
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
attrs==24.2.0
certifi==2024.8.30
charset-normalizer==3.3.2
cloudpickle==3.0.0
datasets==2.21.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
filelock==3.15.4
frozenlist==1.4.1
fsspec==2024.6.1
h11==0.14.0
httpcore==1.0.5
httpx==0.27.2
huggingface-hub==0.24.6
idna==3.8
interegular==0.3.3
Jinja2==3.1.4
jiter==0.5.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
lark==1.2.2
llvmlite==0.43.0
MarkupSafe==2.1.5
multidict==6.0.5
multiprocess==0.70.16
nest-asyncio==1.6.0
numba==0.60.0
numpy==1.26.4
ollama==0.3.2
openai==1.43.1
outlines==0.0.46
packaging==24.1
pandas==2.2.2
pillow==10.4.0
pyairports==2.1.1
pyarrow==17.0.0
pycountry==24.6.1
pydantic==2.9.0
pydantic_core==2.23.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.2
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
rpds-py==0.20.0
setuptools==74.1.2
six==1.16.0
sniffio==1.3.1
tiktoken==0.7.0
tqdm==4.66.5
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
xxhash==3.5.0
yarl==1.9.11
 

Context for the issue:

No response

What happens when you don't pass the tokenizer? The endpoint should be returning text, so a tokenizer shouldn't be necessary.

@lapp0:

# must have ollama running

import tiktoken
from outlines import generate, models

model = models.openai(
    "phi3.5",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

generator = generate.text(model)

result = generator("Question: What's 2+2? Answer:", max_tokens=100)
print(result)

Error:

KeyError: 'Could not automatically map phi3.5 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'
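
This KeyError is raised by tiktoken itself: tiktoken.encoding_for_model only maps official OpenAI model names, so a custom name like phi3.5 cannot be resolved. A minimal sketch of the lookup and the explicit fallback the message suggests:

import tiktoken

try:
    # only official OpenAI model names can be mapped automatically
    enc = tiktoken.encoding_for_model("phi3.5")
except KeyError:
    # fall back to an explicit encoding instead
    enc = tiktoken.get_encoding("cl100k_base")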

@duarteocarmo thanks for pointing this out to me.

I've removed the unnecessary tokenizer requirement. Could you please try:

pip install --upgrade --force-reinstall git+https://github.com/lapp0/outlines@openai-structured-generation

@lapp0 - that solves it! Thanks for the help :)

Great news, glad to help!

Are you able to get a proper JSON response?
I am using Ollama with the Llama 3.1 8B model and keep getting validation errors.
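
For reference, JSON-constrained generation in outlines is normally driven by a schema such as a Pydantic model. A minimal sketch with a hypothetical Answer schema and model name (whether this works against an Ollama OpenAI-compatible endpoint depends on the branch above):

from pydantic import BaseModel
from outlines import generate, models

# hypothetical schema for illustration
class Answer(BaseModel):
    value: int

model = models.openai(
    "llama3.1:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

generator = generate.json(model, Answer)
result = generator("Question: What's 2+2? Answer in JSON:")
print(result)  # an Answer instance, if the model complies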

lapp0 commented

@Pavel401 Could you please share reproduction steps in a separate issue, and include your full version information via pip freeze?