cheshire-cat-ai/core

When using llama-cpp-python as the backend, an exception is thrown when executing the agent chain

Closed this issue · 5 comments

As mentioned on Discord, I have set up the cat with a local llama-cpp-python server, serving a mistral-7b-instruct-v0.2.Q6_K.gguf LLM.
Among the available settings, I left the default "stop" parameter as "Human:,###".

Here is the full configuration (a sketch of the resulting request follows below):
Url: http://cheshire_cat_llamacppserver:8000
Temperature: 0.7
Max Tokens: 2048
Stop: Human:,###
Top K: 40
Top P: 0.95
Repeat Penalty: 1.1
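
For context, these settings presumably end up in the OpenAI-compatible completion request that the cat sends to the llama-cpp-python server, roughly like the sketch below. The endpoint and field names follow the standard llama-cpp-python server API; how exactly the cat builds the request, and that it splits the stop string on commas, are assumptions on my part.

```python
import requests

# Hypothetical sketch of the request the cat presumably sends to the
# llama-cpp-python OpenAI-compatible server, using the settings above.
payload = {
    "prompt": "hello, how are you?",
    "temperature": 0.7,
    "max_tokens": 2048,
    "top_p": 0.95,
    "top_k": 40,                # extra sampling fields accepted by llama-cpp-python
    "repeat_penalty": 1.1,
    "stop": ["Human:", "###"],  # the "Human:,###" setting, split on commas (assumption)
}
resp = requests.post(
    "http://cheshire_cat_llamacppserver:8000/v1/completions",
    json=payload,
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```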

Now, if I check the logs, I see an exception thrown right after "Entering new AgentExecutor chain":

cheshire_cat_core | > Entering new AgentExecutor chain...
cheshire_cat_core |
cheshire_cat_core |
cheshire_cat_core | > Entering new LLMChain chain...
cheshire_cat_core | Prompt after formatting:
cheshire_cat_core | Answer the following question: hello, how are you?
cheshire_cat_core | You can only reply using these tools:
cheshire_cat_core |
cheshire_cat_core | get_the_time: get_the_time(tool_input) - Replies to "what time is it", "get the clock" and similar questions. Input is always None.
cheshire_cat_core | none_of_the_others: none_of_the_others(None) - Use this tool if none of the others tools help. Input is always None.
cheshire_cat_core |
cheshire_cat_core | If you want to use tools, use the following format:
cheshire_cat_core | Action: the name of the action to take, should be one of [get_the_time]
cheshire_cat_core | Action Input: the input to the action
cheshire_cat_core | Observation: the result of the action
cheshire_cat_core | ...
cheshire_cat_core | Action: the name of the action to take, should be one of [get_the_time]
cheshire_cat_core | Action Input: the input to the action
cheshire_cat_core | Observation: the result of the action
cheshire_cat_core |
cheshire_cat_core | When you have a final answer respond with:
cheshire_cat_core | Final Answer: the final answer to the original input question
cheshire_cat_core |
cheshire_cat_core | Begin!
cheshire_cat_core |
cheshire_cat_core | Question: hello, how are you?
cheshire_cat_core |
cheshire_cat_core | [2024-01-16 13:13:28.082] ERROR cat.looking_glass.agent_manager.AgentManager.execute_agent::179 => ValueError('stop found in both the input and default params.')
cheshire_cat_core | Traceback (most recent call last):
cheshire_cat_core | File "/app/cat/looking_glass/agent_manager.py", line 144, in execute_agent
cheshire_cat_core | tools_result = self.execute_tool_agent(agent_input, allowed_tools, stray)
cheshire_cat_core | File "/app/cat/looking_glass/agent_manager.py", line 80, in execute_tool_agent
cheshire_cat_core | out = agent_executor(agent_input)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 316, in call
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 310, in call
cheshire_cat_core | self._call(inputs, run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1312, in _call
cheshire_cat_core | next_step_output = self._take_next_step(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1038, in _take_next_step
cheshire_cat_core | [
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1038, in
cheshire_cat_core | [
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1066, in _iter_next_step
cheshire_cat_core | output = self.agent.plan(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 531, in plan
cheshire_cat_core | output = self.llm_chain.run(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 516, in run
cheshire_cat_core | return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 316, in call
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 310, in call
cheshire_cat_core | self._call(inputs, run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/llm.py", line 103, in _call
cheshire_cat_core | response = self.generate([inputs], run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/llm.py", line 115, in generate
cheshire_cat_core | return self.llm.generate_prompt(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 525, in generate_prompt
cheshire_cat_core | return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 698, in generate
cheshire_cat_core | output = self._generate_helper(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 562, in _generate_helper
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 549, in _generate_helper
cheshire_cat_core | self._generate(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_community/llms/openai.py", line 429, in _generate
cheshire_cat_core | sub_prompts = self.get_sub_prompts(params, prompts, stop)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_community/llms/openai.py", line 547, in get_sub_prompts
cheshire_cat_core | raise ValueError("stop found in both the input and default params.")
cheshire_cat_core | ValueError: stop found in both the input and default params.
cheshire_cat_core |

After some investigation, this turns out to be an error thrown by LangChain, not by the cat. If I understand how it works, the hint is in the last line: ValueError: stop found in both the input and default params.
It looks like the "stop" parameter is passed twice (though I don't know where), and that causes the error that prevents the agent chain from being evaluated properly.

Workaround: if I leave the "stop" setting empty in the llama-cpp server config, then everything works as expected.
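
For reference, the same error can be reproduced outside the cat with plain LangChain, assuming the "Stop" setting is forwarded to the LLM wrapper via model_kwargs (an assumption about the cat's internals); the error is raised before any request reaches the server:

```python
from langchain_community.llms import OpenAI

# Hypothetical minimal reproduction: "stop" baked into the default params via
# model_kwargs, then passed again at call time (as the agent executor does).
llm = OpenAI(
    openai_api_key="not-needed",  # llama-cpp-python ignores the key
    openai_api_base="http://cheshire_cat_llamacppserver:8000/v1",
    model_kwargs={"stop": ["Human:", "###"]},
)

llm.invoke("hello, how are you?", stop=["\nObservation:"])
# ValueError: stop found in both the input and default params.
```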

You understood correctly, but if you set the stop parameter to empty, the model starts generating endless tokens, doesn't it?

Not always, but often...
Of course, this is not a big priority; as you said on Discord, the agent chain does not work very well with local LLMs.

We know that the agent chain doesn't work with local LLMs, and we are building an agent for that. If you want any updates, check the thread "Tool Prompt for local LLMS" under the development channel.
If you want to help, you can share the output of the Python file I pinned in the thread.

Indeed, the more I learn about how the code works, the more I understand that the issue is bigger. I'll check Discord as well and try to contribute. Thank you.

Stale