Cannot run outlines with Ollama via OpenAI-compatible server
Closed this issue · 7 comments
duarteocarmo commented
Describe the issue as clearly as possible:
The OpenAI client errors out when trying to pass a tokenizer to models.openai.
Steps/code to reproduce the bug:
# must have ollama running
import tiktoken
from outlines import generate, models
model = models.openai(
    "phi3.5",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    tokenizer=tiktoken.get_encoding("cl100k_base"),
)
generator = generate.text(model)
result = generator("Question: What's 2+2? Answer:", max_tokens=100)
print(result)
Expected result:
LLM generation
Error message:
TypeError: AsyncOpenAI.__init__() got an unexpected keyword argument 'tokenizer'
Outlines/Python version information:
Version information
(.env) [duarteocarmo:~/Repos/tricount]$ python -c "from outlines import _version; print(_version.version)"
0.0.46
(.env) [duarteocarmo:~/Repos/tricount]$ python -c "import sys; print('Python', sys.version)"
Python 3.12.4 (main, Jun 14 2024, 09:57:09) [Clang 15.0.0 (clang-1500.3.9.4)]
(.env) [duarteocarmo:~/Repos/tricount]$ pip freeze
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
attrs==24.2.0
certifi==2024.8.30
charset-normalizer==3.3.2
cloudpickle==3.0.0
datasets==2.21.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
filelock==3.15.4
frozenlist==1.4.1
fsspec==2024.6.1
h11==0.14.0
httpcore==1.0.5
httpx==0.27.2
huggingface-hub==0.24.6
idna==3.8
interegular==0.3.3
Jinja2==3.1.4
jiter==0.5.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
lark==1.2.2
llvmlite==0.43.0
MarkupSafe==2.1.5
multidict==6.0.5
multiprocess==0.70.16
nest-asyncio==1.6.0
numba==0.60.0
numpy==1.26.4
ollama==0.3.2
openai==1.43.1
outlines==0.0.46
packaging==24.1
pandas==2.2.2
pillow==10.4.0
pyairports==2.1.1
pyarrow==17.0.0
pycountry==24.6.1
pydantic==2.9.0
pydantic_core==2.23.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.2
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
rpds-py==0.20.0
setuptools==74.1.2
six==1.16.0
sniffio==1.3.1
tiktoken==0.7.0
tqdm==4.66.5
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
xxhash==3.5.0
yarl==1.9.11
Context for the issue:
No response
lapp0 commented
What happens when you don't pass the tokenizer? The endpoint should be returning text, so a tokenizer shouldn't be necessary.
duarteocarmo commented
# must have ollama running
import tiktoken
from outlines import generate, models
model = models.openai(
    "phi3.5",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
generator = generate.text(model)
result = generator("Question: What's 2+2? Answer:", max_tokens=100)
print(result)
Error:
KeyError: 'Could not automatically map phi3.5 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'
lapp0 commented
@duarteocarmo thanks for pointing this out to me.
I've removed the unnecessary tokenizer requirement. Could you please try
pip install --force-reinstall git+https://github.com/lapp0/outlines@openai-structured-generation
duarteocarmo commented
@lapp0 - that solves it! Thanks for the help :)
lapp0 commented
Great news, glad to help!
Pavel401 commented
Are you able to get a proper JSON response?
I am using Ollama with the Llama 3.1 8B model.
I keep getting validation errors.
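For anyone hitting the same validation errors, below is a minimal sketch of structured JSON generation against Ollama's OpenAI-compatible endpoint. It assumes the openai-structured-generation branch above is installed; the Answer schema and the "llama3.1:8b" model tag are illustrative assumptions, not taken from this thread.

# Minimal sketch, assuming the openai-structured-generation branch is installed.
# `Answer` is a hypothetical schema; "llama3.1:8b" is Ollama's tag for Llama 3.1 8B.
from pydantic import BaseModel
from outlines import generate, models

class Answer(BaseModel):
    value: int

model = models.openai(
    "llama3.1:8b",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
generator = generate.json(model, Answer)
result = generator("Question: What's 2+2? Answer in JSON:", max_tokens=100)
print(result)  # e.g. Answer(value=4), if the model emits schema-conforming JSON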