Authors:
- André Esteves (up201606673@fe.up.pt)
- Bruno Sousa (up201604145@fe.up.pt)
- Francisco Filipe (up201604601@fe.up.pt)

Robotics - MIEIC 5th Year - FEUP

Repository: https://github.com/Rekicho/robot_hide_seek

Directory Organization:

/launch: Launch files to start the Gazebo simulation with TurtleBot3 robots.
    Adapted from: https://github.com/ROBOTIS-GIT/turtlebot3_simulations/tree/foxy-devel/turtlebot3_gazebo/launch
/models: Adapted TurtleBot3 Burger models, colored according to robot role, with each robot subscribing and publishing to different topics.
    Adapted from: https://github.com/ROBOTIS-GIT/turtlebot3_simulations/tree/foxy-devel/turtlebot3_gazebo/models/turtlebot3_burger
/resource: ROS2 package resource.
/results: Results obtained during Deep Q-Learning training. For each episode, the final reward (indicating win/loss) and the average reward are stored.
/robot_hide_seek: Developed code. Includes:
    deepqlearn: Deep Q-Learning implementation. Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    deeptrain_hider: Script to train hiders using Deep Q-Learning. Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    deeptrain_seeker: Script to train seekers using Deep Q-Learning. Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    game_controller: Game controller node.
    gazebo_connection: Script to pause/unpause/reset the Gazebo simulation. Adapted from https://bitbucket.org/theconstructcore/drone_training/src/master/
    hider_env: OpenAI Gym environment for hider training.
    hider_train: Hider that stores LIDAR readings and does not send velocity commands. Used for training.
    hider: Hider node.
    qlearn: Q-Learning implementation. Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    seeker_env: OpenAI Gym environment for seeker training.
    seeker_train: Seeker that stores LIDAR readings and does not send velocity commands. Used for training.
    seeker: Seeker node.
    train_hider: Script to train hiders using Q-Learning.
        Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    train_seeker: Script to train seekers using Q-Learning. Adapted from https://github.com/vmayoral/basic_reinforcement_learning
    utils: Constants and utility functions for both the game and training.
/training_results: Saved state from robot training.
    /hider and /seeker: Deep Q-Learning neural network weights.
    hiders.txt and seekers.txt: Q-Learning Q tables.
/worlds: Worlds implemented for the Gazebo simulations. The following (n_hiders, n_seekers) configurations are implemented: (1,1), (2,1), (2,2), (1,2).
build.sh: Script to build the package.
deeptrain_hider.sh and deeptrain_seeker.sh: Train hiders/seekers using Deep Q-Learning. Assumes run_sim.sh is running.
kill_all.sh: Kills all nodes.
package.xml: ROS2 package declaration.
run_game.sh: Runs a 2-hider, 2-seeker game.
run_sim.sh: Runs the simulation with a given number of hiders and seekers (default: 2 each).
setup.cfg and setup.py: ROS2 package setup.
train_hider.sh and train_seeker.sh: Train hiders/seekers using Q-Learning. Assumes run_sim.sh is running.
Source code used:
- https://github.com/ROBOTIS-GIT/turtlebot3_simulations/tree/foxy-devel/
- https://github.com/vmayoral/basic_reinforcement_learning
- https://bitbucket.org/theconstructcore/drone_training/src/master/

Dependencies:

Tested with:
- Ubuntu 20.04
- ROS 2 Foxy Fitzroy
- Python 3
- Gazebo

ROS 2 Foxy packages:
- rclpy
- turtlebot3
- turtlebot3_msgs
- turtlebot3_simulations
- sensor_msgs
- geometry_msgs
- std_msgs
- rosgraph_msgs
- nav_msgs
- tf
- std_srvs

Python 3 packages:
- numpy
- keras
- tensorflow - GPU support is not necessary, but is advised for faster Deep Q-Learning training (instructions: https://www.tensorflow.org/install/gpu )
- gym
- transformations

How to test:
- Install the Python packages using pip
- Install ROS2 Foxy
- Create a ~/ros2_ws folder
- Create a src folder inside ros2_ws
- Install the ROS2 packages
- Copy the repository to the ~/ros2_ws/src folder
- Add the following commands to ~/.bashrc:
    - source /opt/ros/foxy/setup.bash
    - source ~/ros2_ws/install/setup.bash
    - export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:~/ros2_ws/src/turtlebot3/turtlebot3_simulations/turtlebot3_gazebo/models:~/ros2_ws/src/robot_hide_seek/models
    - export TURTLEBOT3_MODEL=burger
    - export ROS_DOMAIN_ID=30 #TURTLEBOT3
- Change directory to ~/ros2_ws/src/robot_hide_seek
- Execute: $ chmod +x *.sh

To build the package, run:
$ ./build.sh

To run the simulation:
$ ./run_sim.sh

The simulation can be run at real-world speed by making the following changes in worlds/*.model:
- Comment out "<real_time_update_rate>0</real_time_update_rate>"
- Change "<max_step_size>0.01</max_step_size>" to "<max_step_size>0.001</max_step_size>"
- Uncomment "<real_time_factor>1</real_time_factor>"
Otherwise, the simulation runs at ~15x real-world speed (to make training faster).

To run a game:
$ ./run_game.sh

The constant GAME_USES_TRAINING in utils.py defines whether the robots use the Deep Q-Learning training results (when True) or the basic AI (when False).

To train using Q-Learning:
Training results are loaded when training starts.
To train from scratch, replace the content of training_results/hiders.txt or training_results/seekers.txt with '{}'.
For hiders:
$ ./train_hider.sh
For seekers:
$ ./train_seeker.sh
GAME_USES_TRAINING should be set to False.

To train using Deep Q-Learning:
Training results are loaded when training starts. To train from scratch, delete the training_results/hider or training_results/seeker folders.
For hiders:
$ ./deeptrain_hider.sh
For seekers:
$ ./deeptrain_seeker.sh
GAME_USES_TRAINING should be set to False.
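Since saved state is loaded automatically when either trainer starts, the from-scratch reset steps above can be sketched as a short script (a sketch only; run from the repository root, using the training_results/ paths from the directory layout above):

```shell
# Reset saved training state before a from-scratch run.
mkdir -p training_results   # already present in the repository; a no-op there

# Q-Learning: overwrite the saved Q tables with an empty table.
echo '{}' > training_results/hiders.txt
echo '{}' > training_results/seekers.txt

# Deep Q-Learning: delete the saved neural network weights.
rm -rf training_results/hider training_results/seeker
```

After resetting, set GAME_USES_TRAINING to False in utils.py, start the simulation with ./run_sim.sh, and launch the appropriate training script.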