This documentation provides an overview of the LLMs (Large Language Models) for Robot Planning project. The project leverages state-of-the-art language models to enable robots to perform planning tasks more effectively and efficiently by adding human feedback to the plan generation process. The project is split into two parts:
- Physical Evaluation
- Simulated Evaluation
The physical evaluation involves running the system on a Niryo robot arm. To run this evaluation you will need the Niryo robot arm as well as an Intel RealSense D435i image and depth camera. The robot must be connected to the same network as the control unit (which can be a laptop), while the camera is physically connected to the control unit with a USB Type-C cable. Once the physical apparatus has been configured, the operator can connect to the robot by specifying the robot's network IP address in the main.py file in the physicalevaluation\ directory, as sketched below.
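A minimal connection sketch, assuming the pyniryo client library is installed; the IP address shown is a placeholder, so replace it with the address you set in main.py:

```python
# Minimal connection sketch -- assumes the pyniryo client library is installed
# and that the IP address below is replaced with your robot's actual address.
from pyniryo import NiryoRobot

ROBOT_IP = "192.168.1.10"  # placeholder; use the address configured in main.py

robot = NiryoRobot(ROBOT_IP)   # open a connection to the Niryo arm
robot.calibrate_auto()         # run the automatic calibration routine
robot.move_to_home_pose()      # move the arm to its home position
robot.close_connection()       # release the connection when finished
```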
The simulated evaluation uses VimaBench, which is built on the PyBullet simulator. To use this evaluation, install the requirements listed in simulatedevaluation\requirements.txt and set your OpenAI API key in the simulatedevaluation\visual_programming_prompt\robot_exec_generation.py file.
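The exact variable name expected by that file is not documented here, so the snippet below is only a hypothetical sketch of the assignment; reading the key from an environment variable avoids hard-coding secrets:

```python
# Hypothetical sketch of the key assignment inside
# simulatedevaluation\visual_programming_prompt\robot_exec_generation.py.
# The actual variable name in that file may differ -- check it before editing.
import os

# Read the key from the environment rather than hard-coding it in the file.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "sk-...your-key-here...")
```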
The primary objective of this project is to develop a system that integrates Large Language Models (LLMs) to enhance the planning capabilities of robots. LLMs, such as GPT-3.5, will be used to generate natural language instructions for robots and assist them in making intelligent decisions during planning and execution.
To set up and run the project, you will need the following:
- Robot hardware with necessary sensors and actuators
- A computer with a compatible operating system (Linux, Windows, macOS)
- Python 3.x
- Required Python libraries (e.g., TensorFlow, PyTorch, Hugging Face Transformers)
- An OpenAI API key
- Internet connection (for model access)
The system architecture consists of the following components:
- Robot Hardware: This includes the physical robot equipped with sensors and actuators for interaction with its environment.
- LLM Integration: A pretrained LLM (e.g., GPT-3.5) is integrated into the system. The LLM can generate natural language instructions and assist with decision-making.
- Planning Module: The planning module is responsible for receiving input from the LLM, interpreting it, and generating high-level plans for the robot's tasks.
- Execution Module: The execution module takes the plans generated by the planning module and translates them into low-level control signals for the robot. A minimal sketch of how these components might fit together follows this list.
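The following sketch is an illustrative outline of how these components could be wired together in Python; the class names, prompt wording, and the openai>=1.0 client usage are assumptions, not the project's actual API:

```python
# Illustrative outline of the planning and execution modules; all names and
# prompts here are hypothetical and do not reflect the project's actual code.
from openai import OpenAI


class PlanningModule:
    """Asks the LLM to break a task into a high-level plan (a list of steps)."""

    def __init__(self, client: OpenAI, model: str = "gpt-3.5-turbo"):
        self.client = client
        self.model = model

    def plan(self, task: str) -> list[str]:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system",
                 "content": "Break the task into short robot steps, one per line."},
                {"role": "user", "content": task},
            ],
        )
        text = response.choices[0].message.content
        return [line.strip() for line in text.splitlines() if line.strip()]


class ExecutionModule:
    """Translates high-level steps into low-level robot commands."""

    def execute(self, steps: list[str]) -> None:
        for step in steps:
            # In the real system each step would be mapped onto robot
            # primitives (move, grasp, release); here we only print it.
            print(f"Executing: {step}")
```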
To install and set up the project, follow these steps:
- Clone the project repository from GitHub: git clone https://github.com/asuzukosi/ned_lang.git
- Install the required Python dependencies using pip.
- Obtain access to a pretrained LLM model (e.g., GPT-3.5) and set up the API credentials or connection as needed (see the sketch after this list).
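One way to confirm that the credentials work is a minimal test call; this sketch assumes the openai>=1.0 Python client and a key exported as OPENAI_API_KEY, which may differ from how the project itself reads credentials:

```python
# Minimal access check -- assumes the openai>=1.0 client and that the key is
# exported as the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically
reply = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Reply with 'ready' if you can read this."}],
)
print(reply.choices[0].message.content)
```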
To use the system, follow these general steps:
- Configure the robot hardware and ensure it's connected to the system.
- Start the system, including the LLM integration, planning module, and execution module.
- Provide tasks or queries to the system in natural language.
- The LLM generates instructions, and the planning module interprets them to create high-level plans.
- The execution module translates high-level plans into low-level commands for the robot to carry out the tasks.
- Monitor and fine-tune the system's behavior as needed. An end-to-end sketch of this flow is shown below.
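The snippet below is a hypothetical, self-contained illustration of this flow (natural-language task in, high-level plan out, plan handed to execution); the prompt wording and the print-based "execution" are placeholders, not the project's actual pipeline:

```python
# Hypothetical end-to-end flow: natural-language task -> LLM plan -> execution.
# Assumes the openai>=1.0 client and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()
task = "Pick up the red ball and place it on the table."

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Break the task into short robot steps, one per line."},
        {"role": "user", "content": task},
    ],
)

plan = [line.strip() for line in response.choices[0].message.content.splitlines()
        if line.strip()]
for step in plan:
    # A real execution module would map each step onto robot primitives
    # (move, grasp, release); here we only log the plan.
    print("Executing:", step)
```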
Here are some examples of how to use the system:
- User Input: "Pick up the red ball and place it on the table."
- LLM Output: "The red ball should be picked up and placed on the table."
- System Execution: The robot follows the instruction to pick up the red ball and place it on the table.