cheshire-cat-ai/core

When using llama-cpp-python as the backend, an exception is thrown when executing the agent chain

Closed this issue · 5 comments

As mentioned on Discord, I have set up the cat with a local llama-cpp-python server, serving a mistral-7b-instruct-v0.2.Q6_K.gguf LLM.
Among the available settings, I left the default "stop" parameter as "Human:,###".

Here is the full configuration (a sketch of the resulting request follows below):
Url: http://cheshire_cat_llamacppserver:8000
Temperature: 0.7
Max Tokens: 2048
Stop: Human:,###
Top K: 40
Top P: 0.95
Repeat Penalty: 1.1
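
For context, these settings presumably end up in the OpenAI-compatible completion request that the cat sends to the llama-cpp-python server, roughly like the sketch below. The endpoint and field names follow the standard llama-cpp-python server API; how exactly the cat builds the request, and that it splits the stop string on commas, are assumptions on my part.

```python
import requests

# Hypothetical sketch of the request the cat presumably sends to the
# llama-cpp-python OpenAI-compatible server, using the settings above.
payload = {
    "prompt": "hello, how are you?",
    "temperature": 0.7,
    "max_tokens": 2048,
    "top_p": 0.95,
    "top_k": 40,                # extra sampling fields accepted by llama-cpp-python
    "repeat_penalty": 1.1,
    "stop": ["Human:", "###"],  # the "Human:,###" setting, split on commas (assumption)
}
resp = requests.post(
    "http://cheshire_cat_llamacppserver:8000/v1/completions",
    json=payload,
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```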

Now, if I check the logs, I see an exception thrown right after "Entering new AgentExecutor chain":

cheshire_cat_core | > Entering new AgentExecutor chain...
cheshire_cat_core |
cheshire_cat_core |
cheshire_cat_core | > Entering new LLMChain chain...
cheshire_cat_core | Prompt after formatting:
cheshire_cat_core | Answer the following question: hello, how are you?
cheshire_cat_core | You can only reply using these tools:
cheshire_cat_core |
cheshire_cat_core | get_the_time: get_the_time(tool_input) - Replies to "what time is it", "get the clock" and similar questions. Input is always None.
cheshire_cat_core | none_of_the_others: none_of_the_others(None) - Use this tool if none of the others tools help. Input is always None.
cheshire_cat_core |
cheshire_cat_core | If you want to use tools, use the following format:
cheshire_cat_core | Action: the name of the action to take, should be one of [get_the_time]
cheshire_cat_core | Action Input: the input to the action
cheshire_cat_core | Observation: the result of the action
cheshire_cat_core | ...
cheshire_cat_core | Action: the name of the action to take, should be one of [get_the_time]
cheshire_cat_core | Action Input: the input to the action
cheshire_cat_core | Observation: the result of the action
cheshire_cat_core |
cheshire_cat_core | When you have a final answer respond with:
cheshire_cat_core | Final Answer: the final answer to the original input question
cheshire_cat_core |
cheshire_cat_core | Begin!
cheshire_cat_core |
cheshire_cat_core | Question: hello, how are you?
cheshire_cat_core |
cheshire_cat_core | [2024-01-16 13:13:28.082] ERROR cat.looking_glass.agent_manager.AgentManager.execute_agent::179 => ValueError('stop found in both the input and default params.')
cheshire_cat_core | Traceback (most recent call last):
cheshire_cat_core | File "/app/cat/looking_glass/agent_manager.py", line 144, in execute_agent
cheshire_cat_core | tools_result = self.execute_tool_agent(agent_input, allowed_tools, stray)
cheshire_cat_core | File "/app/cat/looking_glass/agent_manager.py", line 80, in execute_tool_agent
cheshire_cat_core | out = agent_executor(agent_input)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 316, in call
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 310, in call
cheshire_cat_core | self._call(inputs, run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1312, in _call
cheshire_cat_core | next_step_output = self._take_next_step(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1038, in _take_next_step
cheshire_cat_core | [
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1038, in
cheshire_cat_core | [
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 1066, in _iter_next_step
cheshire_cat_core | output = self.agent.plan(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/agents/agent.py", line 531, in plan
cheshire_cat_core | output = self.llm_chain.run(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 516, in run
cheshire_cat_core | return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 316, in call
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/base.py", line 310, in call
cheshire_cat_core | self._call(inputs, run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/llm.py", line 103, in _call
cheshire_cat_core | response = self.generate([inputs], run_manager=run_manager)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain/chains/llm.py", line 115, in generate
cheshire_cat_core | return self.llm.generate_prompt(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 525, in generate_prompt
cheshire_cat_core | return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 698, in generate
cheshire_cat_core | output = self._generate_helper(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 562, in _generate_helper
cheshire_cat_core | raise e
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 549, in _generate_helper
cheshire_cat_core | self._generate(
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_community/llms/openai.py", line 429, in _generate
cheshire_cat_core | sub_prompts = self.get_sub_prompts(params, prompts, stop)
cheshire_cat_core | File "/usr/local/lib/python3.10/site-packages/langchain_community/llms/openai.py", line 547, in get_sub_prompts
cheshire_cat_core | raise ValueError("stop found in both the input and default params.")
cheshire_cat_core | ValueError: stop found in both the input and default params.
cheshire_cat_core |

After some investigation, this turns out to be an error thrown by LangChain, not by the cat. If I understand how it works, the hint is in the last line: ValueError: stop found in both the input and default params.
It looks like the "stop" parameter is passed twice (though I don't know where), and that causes the error that prevents the agent chain from being evaluated properly.

Workaround: if I leave the "stop" setting empty in the llama-cpp server config, then everything works as expected.
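
For reference, the same error can be reproduced outside the cat with plain LangChain, assuming the "Stop" setting is forwarded to the LLM wrapper via model_kwargs (an assumption about the cat's internals); the error is raised before any request reaches the server:

```python
from langchain_community.llms import OpenAI

# Hypothetical minimal reproduction: "stop" baked into the default params via
# model_kwargs, then passed again at call time (as the agent executor does).
llm = OpenAI(
    openai_api_key="not-needed",  # llama-cpp-python ignores the key
    openai_api_base="http://cheshire_cat_llamacppserver:8000/v1",
    model_kwargs={"stop": ["Human:", "###"]},
)

llm.invoke("hello, how are you?", stop=["\nObservation:"])
# ValueError: stop found in both the input and default params.
```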

You understood correctly, but if you set the stop parameter to empty, the model starts generating endless tokens, doesn't it?

Not always, but often...
Of course, this is not a big priority; as you said on Discord, the agent chain does not work very well with local LLMs.

We know that the agent chain doesn't work with local LLMs, and we are building an agent for that. If you want any updates, check the thread "Tool Prompt for local LLMS" under the development channel.
If you want to help, you can share the output of the Python file I pinned in the thread.

Indeed, the more I learn about how the code works, the more I understand that the issue is bigger. I'll check Discord as well and try to contribute. Thank you.

Stale