AOS

Links:

Experiments and Videos:

Videos:

Real Panda robot tic-tac-toe experiments:

Basic experiment:

The skills documentation can be found at.
In this experiment, we programmed a Panda CoBot to play tic-tac-toe with a human. An Intel RealSense D415 camera was attached to the robot arm, and an erasable board with a tic-tac-toe grid was placed within its reach. The experiment was based on two skills: marking a circle in a specific grid cell, and detecting a change in the board state and extracting the new board state. The first skill was implemented using our own PID controller based on libfranka, which we wrapped as a ROS service. The second skill was adapted from code found on the web. After experimenting with the code to learn its properties, we specified PLP and AM files for each skill. The AOS allows the specification of PLPs that describe exogenous events and are executed before every agent action; we used this feature to model the human's action, treating the human as making random legal choices. Finally, in the environment file we defined the goal reward, the initial state of an empty board, and the starting player. From this point on, it was plug-and-play, requiring no additional effort: the AOS auto-generated the code, and we ran the game (changing the starting player as desired). Because the human was modeled as a random player, you can observe in the videos that the robot sometimes "counts" on the human making a mistake and not completing a sequence of three.
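To make the exogenous-event idea concrete, below is a minimal sketch of the human model described above: the human simply marks a uniformly random empty cell. This is only an illustration in Python; the actual model is written as a PLP in the AOS documentation format, and the board encoding used here is our own assumption.

```python
import random

EMPTY, ROBOT_O, HUMAN_X = 0, 1, 2  # assumed cell encoding, row-major 3x3 board

def human_exogenous_event(board):
    """Sketch of the exogenous human move: mark a uniformly random legal cell.

    Returns the successor board (unchanged if no legal move exists)."""
    legal = [i for i, cell in enumerate(board) if cell == EMPTY]
    if not legal:
        return board
    nxt = list(board)
    nxt[random.choice(legal)] = HUMAN_X
    return nxt
```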

Basic tic-tac-toe videos:

Probabilistic tic-tac-toe experiments:

The skills documentation can be found at.
We tested the AOS's ability to adapt to changes in the robot, environment, and task. First, we changed the circle-drawing skill to emulate an arm that has difficulty drawing in the center square, succeeding only half the time. The turn passes to the human after the robot's attempt, regardless of the outcome. The only effort required on our part was a trivial modification of the draw-circle skill's PLP to reflect its modified behavior. The POMDP solver now optimizes given this updated PLP, and you can observe in the videos that the robot prefers drawing a circle in other positions when possible, selecting the center square only when it is crucial. Imagine the effort of changing a script to adapt to this capability change; notice that a classical solver cannot even model it.
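For illustration, a sketch of the modified draw-circle outcome model follows. This is not the PLP syntax; the 0.5 success probability for the center square is the only number taken from the experiment description, and everything else (encoding, function name) is assumed.

```python
import random

CENTER = 4  # row-major index of the center cell on a 3x3 board (assumed encoding)

def draw_circle_outcome(board, cell, robot_mark=1, p_center_success=0.5):
    """Sketch of the modified skill model: drawing in the center succeeds only
    half the time; drawing elsewhere succeeds deterministically. The turn
    passes to the human afterwards regardless of the outcome."""
    nxt = list(board)
    if cell != CENTER or random.random() < p_center_success:
        nxt[cell] = robot_mark
    return nxt
```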

Probabilistic tic-tac-toe videos:

Changing the rules of the game experiments:

The skills documentation can be found at.
Our next experiment considered a new task: players score points for marking positions adjacent to corner squares they marked earlier. This is a different game played with the same skills. To play it, the AOS requires only the minimal user effort of modifying the reward function; the resulting behavior, however, is quite different. With a scripted approach, a completely new script would have to be written, which would require figuring out good strategies for this game, and classical planners would not be able to model this objective well.
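As a rough illustration of the kind of reward change involved, the sketch below scores a move that marks a cell adjacent to a corner the same player marked earlier. The adjacency notion (edge-sharing neighbours) and the bonus value are our assumptions; in the AOS, this change amounts to editing the reward specification in the environment file rather than writing Python.

```python
CORNERS = {0, 2, 6, 8}  # corner cells of a row-major 3x3 board

# Orthogonal neighbours on the 3x3 grid (the exact notion of "adjacent" used
# in the experiment is an assumption here).
NEIGHBOURS = {
    0: {1, 3}, 1: {0, 2, 4}, 2: {1, 5},
    3: {0, 4, 6}, 4: {1, 3, 5, 7}, 5: {2, 4, 8},
    6: {3, 7}, 7: {4, 6, 8}, 8: {5, 7},
}

def marking_reward(board, cell, player_mark, bonus=10.0):
    """Hypothetical sketch of the modified reward: a player is rewarded for
    marking a cell adjacent to a corner the same player marked earlier."""
    owned_corners = {c for c in CORNERS if board[c] == player_mark}
    return bonus if NEIGHBOURS[cell] & owned_corners else 0.0
```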

Tic-tac-toe with changed game rules videos:

Unknown initial state experiments:

The skills documentation can be found at.
In this experiment, the robot starts with a board on which three moves have already been played. We specified the initial belief so that any legal sequence of three moves is possible. An autonomous robot must be able to start from different possible states of the environment. Again, this is not a manually written tic-tac-toe script; it is a general-purpose algorithm that operates an autonomous robot to maximize its objectives in a partially observable stochastic environment.
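The initial belief over three-move boards can be pictured as a uniform distribution over all boards reachable by three legal alternating moves from the empty board. The sketch below enumerates them; the encoding and the choice of first player are assumptions, and whether the belief is uniform over boards or over move sequences is a modeling detail of the environment file.

```python
from itertools import permutations

EMPTY, X, O = 0, 1, 2  # assumed cell encoding, row-major 3x3 board

def legal_three_move_boards(first_player=X):
    """Sketch: enumerate every board reachable after exactly three alternating
    moves from the empty board (which player moved first is assumed to be set
    in the environment file)."""
    second = O if first_player == X else X
    marks = (first_player, second, first_player)  # three alternating moves
    boards = set()
    for cells in permutations(range(9), 3):
        board = [EMPTY] * 9
        for cell, mark in zip(cells, marks):
            board[cell] = mark
        boards.add(tuple(board))   # deduplicate boards reached in different orders
    return boards

# Each enumerated board would receive equal probability in the initial belief.
```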

Tic-tac-toe with unknown initial state videos:

Armadillo Gazebo experiments:

A detailed description of how we built this experiment and the documentation used can be found at.
The simulated environment included a room with two tables and a corridor with a person. Each table had a can on it. One of the cans was very difficult to pick up (its true size was 10% of the size perceived by the robot). The robot was located near the table with the difficult can. The goal was to give a can to the person in the corridor. We implemented three skills: pick-can; navigate, which moves to a person or a table; and serve-can, which hands the can to the person. For the experiments, we used two versions of the pick PLP: a "rough" model that assumes the probability of a successful pick is independent of the selected table, and a "finer" model in which the success probability is conditioned on the robot's position.
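To make the difference between the two models concrete, here is an illustrative sketch. The numbers are made up (the real values came from the observed skill statistics); only the structure matters: the rough model ignores the robot's position, while the finer model conditions on it.

```python
def pick_success_probability(robot_location, model="finer"):
    """Illustrative sketch of the two pick models; the probabilities below are
    placeholders, not the values used in the experiment."""
    if model == "rough":
        # Rough model: success probability independent of the selected table.
        return 0.5
    # Finer model: the can near the robot's starting table is very hard to
    # grasp (its true size is 10% of the size perceived by the robot).
    return {"table_with_difficult_can": 0.1,
            "table_with_normal_can": 0.9}.get(robot_location, 0.0)
```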

First, we experimented with each skill and used the observed statistics of its behavior to write its PLP file. Then, we specified the AM files and the task specification. As above, this information was enough to enable the AOS to control the robot through the task. During plan execution, we observed that the pick skill occasionally ends with the arm outstretched; attempting to serve the person in this state causes a collision (i.e., it would injure the person). Moreover, pick returned success if motion planning and motion execution succeeded, but this did not imply that the can was actually picked. Therefore, we wrote two new sensing skills: detect-hold-can and detect-arm-stretched. This was almost immediate for two reasons: sensing skills are easy to define using the AM files, and their implementation simply maps low-level data published by the robot (gripper pressure, arm-joint angles) into the abstract variables used by the PLPs. We also implemented an alternative pick skill with integrated success sensing; its return value reflects the outcome of sensing whether the can is held. This, too, is very easy to do through the output specification in the PLP. Both changes involved adding two lines to the respective files. Detect-hold-can is noisy and was modeled as such; detect-arm-stretched is not noisy.
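The sketch below shows the kind of mapping detect-hold-can performs, written as a small ROS node that turns gripper joint effort into a boolean "holding the can" signal. The actual AM files express this mapping declaratively rather than in Python, and the topic name, joint name, and threshold below are assumptions; the PLP would additionally model the observation noise of this skill.

```python
#!/usr/bin/env python
# Illustrative sketch only: topic names, the joint name, and the threshold are
# assumptions, not the values used on the Armadillo robot.
import rospy
from sensor_msgs.msg import JointState
from std_msgs.msg import Bool

PRESSURE_THRESHOLD = 0.3  # assumed gripper-effort threshold for "holding"

def joint_states_cb(msg, pub):
    # Treat high effort on the gripper finger joint as "the can is held".
    try:
        idx = msg.name.index("gripper_finger_joint")  # hypothetical joint name
    except ValueError:
        return
    pub.publish(Bool(data=abs(msg.effort[idx]) > PRESSURE_THRESHOLD))

if __name__ == "__main__":
    rospy.init_node("detect_hold_can_sketch")
    pub = rospy.Publisher("/aos/holding_can", Bool, queue_size=1)
    rospy.Subscriber("/joint_states", JointState, joint_states_cb, callback_args=pub)
    rospy.spin()
```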

First, with the rough model, the robot (correctly) tries to pick the problematic can because this saves the cost of navigating to the other table, while with the finer model, it first moves to the other table, where pick is more likely to succeed. Second, without sensing actions, the robot serves the can, but then, because it has no feedback, goes back to the tables and tries to repeat the process; with sensing, the robot verifies success and only then serves the can and stops. Moreover, since sensing is noisy, the robot performs multiple sensing actions to reach a belief state with less uncertainty, because the results of the sensing actions are modeled as independent. However, when sensing is integrated into the pick action, the robot cannot do such independent sensing, and repeating the pick action is not desirable.

Armadillo Gazebo videos:

Real Armadillo Robot experiments:

The skills documentation can be found at.
Next, we tested the integration of the AOS with the real Armadillo robot in a simple lab-cleaning task. In our lab there are six workstations and two trash cans. Two empty cups are placed somewhere in stations 1, 2, or 3, and two more in stations 4, 5, or 6. The robot can navigate to each of the stations and trash cans; it can observe whether an empty cup is placed on a nearby station, pick up a nearby cup, and throw a held cup into a nearby trash can. The robot should collect all the cups and place them in the trash cans as fast as possible. Navigation cost is proportional to the distance to the destination, so the robot seeks the shortest route for performing its task. We implemented each of these skills, which are deterministic, and mapped our lab using the ROS gmapping package so that we could use the ROS navigation stack. We documented the skills, the environment, and the robot's objective, and activated the robot, which used its skills to clean our lab as expected. As above, the only integration effort needed was preparing the documentation files.
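For reference, a navigate skill of this kind can be realized as a thin wrapper around the ROS navigation stack's move_base action, roughly as sketched below. The station coordinates and node names are placeholders, not the ones used in our lab map.

```python
#!/usr/bin/env python
# Sketch of a navigate skill built on the ROS navigation stack (move_base).
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

def navigate_to(x, y, frame="map"):
    """Send a 2D goal in the map frame and block until move_base finishes."""
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()
    goal = MoveBaseGoal()
    goal.target_pose.header.frame_id = frame
    goal.target_pose.header.stamp = rospy.Time.now()
    goal.target_pose.pose.position.x = x
    goal.target_pose.pose.position.y = y
    goal.target_pose.pose.orientation.w = 1.0  # identity orientation
    client.send_goal(goal)
    client.wait_for_result()
    return client.get_state()  # GoalStatus.SUCCEEDED == 3 on success

if __name__ == "__main__":
    rospy.init_node("navigate_skill_sketch")
    navigate_to(1.0, 2.0)  # hypothetical coordinates of "station 1"
```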

Real Armadillo Robot videos:

TurtleBot3 Gazebo experiments:

The skills documentation can be found at.
This is a first integration experiment.
The video shows:

  • Starting the AOS.
  • Sending an HTTP request that integrates the user's code and operates the robot according to the PLPs.
  • The robot simulation.
  • Sending another HTTP request to see (a) the sequence of actions sent by the solver, with their details and the responses returned by the code modules, and (b) the belief state maintained by the AOS during the process (we configured it to show only one particle of the belief state, but this is configurable).

The video demonstrates how the AOS finds the shortest path for the robot to visit seven critical points on the map. The user wrote a PLP describing the robot's navigation skill and an environment file describing the goal: to visit all the points while traveling the shortest distance. In this simple example, the initial state is known and the outcomes of the actions are deterministic (navigation always succeeds). The solver's planning time was set to 0.1 seconds per action.

The AOS integrates the user's code automatically: the user only needs to send a request to the AOS RESTful API (an HTTP request), and the integration and execution are performed automatically.
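A request of this kind might look roughly like the sketch below. The endpoint paths, port, and payload fields are placeholders, not the documented AOS API; consult the AOS documentation for the actual interface.

```python
import requests

AOS_URL = "http://localhost:5000"  # assumed address of a locally running AOS server

# 1) Ask the AOS to integrate the documented skills and start execution
#    (hypothetical endpoint and payload).
resp = requests.post(f"{AOS_URL}/execution", json={"project": "turtlebot3_demo"})
print(resp.status_code, resp.text)

# 2) Later, query the executed actions and the maintained belief state
#    (hypothetical endpoint; one belief particle requested, as in the video).
state = requests.get(f"{AOS_URL}/execution/state",
                     params={"beliefParticles": 1}).json()
for action in state.get("actions", []):
    print(action)
```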

TurtleBot3 videos:

AOS Installation

Requirements

  • Ubuntu 20.04

Installation Steps

AOS-ML-Server

AOS GUI

The GUI allows users to:

  • Create new project documentation or edit existing documentation (only the JSON format is supported for creation and editing).
  • Debug and visualize model correctness, robot execution, and the progress of the belief state during execution.
  • AOS-GUI github
  • AOS-GUI project page
  • AOS-GUI video (less than three minutes)

Tutorial: