LLMs to decide on actions and execute them.
This project demonstrates a basic LLM-agent implementation on top of the LLaMA-2 chat models. I used the LLaMA-2-13b-chat 4-bit model, which I found follows instructions more reliably than the LLaMA-2-7b-chat fp16 model. Moreover, it only requires about 10.5 GB of GPU memory.
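The agent pattern described above can be sketched roughly as follows: the model proposes an action, the runtime executes it, and the result is fed back into the conversation. Everything in this sketch is a hypothetical illustration (the `fake_llm` stand-in, the tool names, and the JSON action format are assumptions), not the project's actual code:

```python
import json

def fake_llm(messages):
    # Stand-in for a call to the quantized LLaMA-2-13b-chat model.
    # A real implementation would format `messages` into a chat prompt
    # and parse the model's reply; here we always propose one action.
    return json.dumps({"action": "calculate", "input": "2 + 3"})

# Hypothetical tool registry the agent can dispatch to.
TOOLS = {
    "calculate": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(question, llm=fake_llm):
    messages = [{"role": "user", "content": question}]
    reply = json.loads(llm(messages))                 # model chooses an action
    result = TOOLS[reply["action"]](reply["input"])   # runtime executes it
    messages.append({"role": "assistant", "content": result})
    return result

print(run_agent("What is 2 + 3?"))
```

In the real project, the model call and the tool execution run in separate processes, which is why a client and an agent script are started independently below.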
You have to run two files in order to run the agent. First, start the client:

```shell
python run_client.py
```

Then run the agent:

```shell
python agent.py
```