luk-s/Targeted-Manipulation-and-Deception-in-LLMs
A benchmark for evaluating the tendency of LLM agents to influence human preferences
Python
Stargazers
No one has starred this repository yet.