self_balancing

Training a self-balancing robot with an Actor-Critic network in Gazebo.

URDF of the simplified model

Classic PID controller

Classic controller replaced by a deep reinforcement learning algorithm
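
For orientation, the classic PID controller that the learned policy replaces could look roughly like the sketch below. This is not the code from this repository: the class name `PID`, the gains, and the idea of feeding the tilt error (target pitch minus measured pitch) in and getting a wheel command out are all assumptions made for illustration.

```python
class PID:
    """Minimal textbook PID controller (hypothetical, not the repository's code)."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp = kp  # proportional gain
        self.ki = ki  # integral gain
        self.kd = kd  # derivative gain
        self.integral = 0.0    # running integral of the error
        self.prev_error = 0.0  # error from the previous step

    def step(self, error: float, dt: float) -> float:
        """Return the control output for one time step of length dt (seconds)."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Example use with assumed gains: the error would be the robot's tilt
# relative to upright, and the output a wheel velocity or torque command.
pid = PID(kp=2.0, ki=0.0, kd=0.0)
command = pid.step(error=0.5, dt=0.01)
```

An actor-critic agent takes over the same role: instead of hand-tuned gains, the actor network maps the observed state to the wheel command, and the critic estimates how good that state is during training.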