BT action client of type nav2_msgs::action::NavigateToPose sends ActionNodeErrorCode SEND_GOAL_TIMEOUT

Question

BT action client of type nav2_msgs::action::NavigateToPose sends ActionNodeErrorCode SEND_GOAL_TIMEOUT

Opened this issue a month ago · 7 comments

Running ROS 2 Humble on Ubuntu 22.04

The problem

I'm using a behavior tree that has a BT action node to send a goal to the Nav2 navigate_to_pose action server. My problem is that if the robot navigates for more than 10 seconds (which is the default server_timeout of a BT::RosActionNode), the node's onFailure method is triggered and returns the error code SEND_GOAL_TIMEOUT. From what I've read of the bt_action_node.hpp file, this means that the client did not receive a response regarding whether the goal was accepted or not from the server within 10 seconds.

The thing is that even if I haven't received a goal_response, the Nav2 action server runs normally and the robot navigates (since async_send_goal is used I'm guessing). I think the problem lies in the way that bt_action_node.hpp (line 459 and after) checks the future to receive the goal_handle. It seems to never return rclcpp::FutureReturnCode::SUCCESS in the spin_until_future_complete with no delay.

What's weird is that if the navigation time is under 10 seconds, then the BT action node does not fail, but I can see in the prints that the goal_response_callback only gets called after the robot reaches its navigation goal. This means that the response about whether the action goal is accepted or not is only received after the robot is done navigating. I also don't receive any feedback from the action server during the navigation, probably since I have no goal_handle.

I also have another BT action node for docking in my behavior tree (where the action server was custom-made), and I don't seem to have a problem with it since I can see in the debug prints that the goal was accepted and I receive feedback.
Because of this, I'm not sure if it's a Nav2 problem with the way that the navigate_to_pose action server was written or an edge case that BehaviorTree.ROS2 did not consider.

Maybe a fix

I found a way to make it work for the navigate_to_pose action by setting the nodelay in bt_action_node.hpp (line 452) to 10ms instead of 0ms:
auto nodelay = std::chrono::milliseconds(0); -> auto nodelay = std::chrono::milliseconds(10);
I don't have a clear understanding of the reason why the spin_until_future_complete has a timeout of 0ms. I guess it's to have a non-blocking behavior.

I've recreated the same issue with a simple action client node that calls the navigate_to_pose action server and handles the goal accepted response with the same logic that bt_action_node.hpp has. I'll leave the .hpp and .cpp code if you want to test it.

.hpp

#ifndef NAVIGATE_TO_POSE_CLIENT_HPP
#define NAVIGATE_TO_POSE_CLIENT_HPP

#include <rclcpp/rclcpp.hpp>
#include <rclcpp_action/rclcpp_action.hpp>
#include <geometry_msgs/msg/pose_stamped.hpp>
#include <nav2_msgs/action/navigate_to_pose.hpp>

class NavigateToPoseClient : public rclcpp::Node
{
public:
    using NavigateToPose = nav2_msgs::action::NavigateToPose;
    using GoalHandleNavigateToPose = rclcpp_action::ClientGoalHandle<NavigateToPose>;

    explicit NavigateToPoseClient(const rclcpp::NodeOptions & options = rclcpp::NodeOptions());

    void send_goal(const geometry_msgs::msg::PoseStamped & goal_pose);
    bool spin_until_goal_received();

private:
    void goal_response_callback(const rclcpp_action::ClientGoalHandle<NavigateToPose>::SharedPtr & goal);
    void feedback_callback(GoalHandleNavigateToPose::SharedPtr, const std::shared_ptr<const NavigateToPose::Feedback> feedback);
    void result_callback(const GoalHandleNavigateToPose::WrappedResult & result);

    rclcpp_action::Client<NavigateToPose>::SharedPtr action_client_;
    std::shared_future<GoalHandleNavigateToPose::SharedPtr> future_goal_handle_;
    GoalHandleNavigateToPose::SharedPtr goal_handle_;
    rclcpp::Time time_goal_sent_;
    bool goal_received_ = false;
};

#endif // NAVIGATE_TO_POSE_CLIENT_HPP

.cpp

#include "osc_nav/nodes/navigate_to_pose_client.hpp"

NavigateToPoseClient::NavigateToPoseClient(const rclcpp::NodeOptions & options)
    : Node("navigate_to_pose_client", options)
{
    action_client_ = rclcpp_action::create_client<NavigateToPose>(this, "navigate_to_pose");

    // Wait for the action server to be available
    while (!action_client_->wait_for_action_server(std::chrono::seconds(10))) {
        RCLCPP_INFO(this->get_logger(), "Waiting for action server to be available...");
    }
}

void NavigateToPoseClient::send_goal(const geometry_msgs::msg::PoseStamped & goal_pose)
{
    if (!action_client_) {
        RCLCPP_ERROR(this->get_logger(), "Action client not initialized");
        return;
    }

    auto goal_msg = NavigateToPose::Goal();
    goal_msg.pose = goal_pose;

    RCLCPP_INFO(this->get_logger(), "Sending goal");

    auto send_goal_options = rclcpp_action::Client<NavigateToPose>::SendGoalOptions();
    send_goal_options.goal_response_callback = std::bind(&NavigateToPoseClient::goal_response_callback, this, std::placeholders::_1);
    send_goal_options.feedback_callback = std::bind(&NavigateToPoseClient::feedback_callback, this, std::placeholders::_1, std::placeholders::_2);
    send_goal_options.result_callback = std::bind(&NavigateToPoseClient::result_callback, this, std::placeholders::_1);

    future_goal_handle_ = action_client_->async_send_goal(goal_msg, send_goal_options);
    time_goal_sent_ = this->now();
}

bool NavigateToPoseClient::spin_until_goal_received()
{
    auto nodelay = std::chrono::milliseconds(0); // 0ms creates the bug where the goal accpeted is received only at the end. 10ms solves the issue
    auto timeout = rclcpp::Duration::from_seconds(double(10));
    RCLCPP_WARN(this->get_logger(), "Timeout: %f", timeout.seconds());

    auto ret = rclcpp::spin_until_future_complete(this->get_node_base_interface(), future_goal_handle_, nodelay);
    if(ret != rclcpp::FutureReturnCode::SUCCESS)
    {
        if((this->now() - time_goal_sent_) > timeout)
        {
            RCLCPP_ERROR(this->get_logger(), "Goal timed out");
            return false;
        }
        else
        {
            return false;
        }
    }
    else
    {
        goal_received_ = true;
        goal_handle_ = future_goal_handle_.get();
        future_goal_handle_ = {};

        if(!goal_handle_)
        {
            RCLCPP_ERROR(this->get_logger(), "Goal rejected by server");
            return false;
        }
        return true;
    }
}

void NavigateToPoseClient::goal_response_callback(const rclcpp_action::ClientGoalHandle<NavigateToPose>::SharedPtr & goal)
{
    auto goal_handle = goal.get();
    if (!goal_handle) {
        RCLCPP_ERROR(this->get_logger(), "Goal was rejected by the action server");
    } else {
        RCLCPP_INFO(this->get_logger(), "Goal accepted by the action server, waiting for result");
    }
}

void NavigateToPoseClient::feedback_callback(GoalHandleNavigateToPose::SharedPtr, const std::shared_ptr<const NavigateToPose::Feedback> feedback)
{
    RCLCPP_INFO(this->get_logger(), "Received feedback: distance remaining = %f", feedback->distance_remaining);
}

void NavigateToPoseClient::result_callback(const GoalHandleNavigateToPose::WrappedResult & result)
{
    switch (result.code) {
        case rclcpp_action::ResultCode::SUCCEEDED:
            RCLCPP_INFO(this->get_logger(), "Goal succeeded");
            break;
        case rclcpp_action::ResultCode::ABORTED:
            RCLCPP_ERROR(this->get_logger(), "Goal was aborted");
            break;
        case rclcpp_action::ResultCode::CANCELED:
            RCLCPP_ERROR(this->get_logger(), "Goal was canceled");
            break;
        default:
            RCLCPP_ERROR(this->get_logger(), "Unknown result code");
            break;
    }
}

int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);
    auto node = std::make_shared<NavigateToPoseClient>();

    // Example goal pose, replace with your desired goal
    geometry_msgs::msg::PoseStamped goal_pose;
    goal_pose.header.frame_id = "map";
    goal_pose.header.stamp = node->now();
    goal_pose.pose.position.x = 0.0;
    goal_pose.pose.position.y = -1.0;
    goal_pose.pose.orientation.w = 1.0;

    node->send_goal(goal_pose);

    while (rclcpp::ok())
    {
        if (node->spin_until_goal_received())
        {
            break;
        }
        rclcpp::sleep_for(std::chrono::milliseconds(100)); // Small sleep to prevent busy-waiting
    }

    rclcpp::spin(node);
    rclcpp::shutdown();
    return 0;
}

Answer 1 · 2024-06-07T12:06:50.000Z

I think this explains better issue #48

Answer 2 · 2024-06-11T15:34:11.000Z

please try branch issue_76 and let me know if problem persists

Answer 3 · 2024-06-11T20:34:30.000Z

@facontidavide Unfortunately, it does not seem to solve the issue. I still receive a SEND_GOAL_TIMEOUT error after 10 seconds. I changed the log level of the tree_executor to debug. Here's the traceback, maybe it can help you:

... 10s later

And my overridden onFailure method of my BT::RosActionNode<Nav2>

I still have the same behavior where my other BT action node for docking works properly but not the nav2 one. I also tried adding a max_duration to the spin_some like I did previously with the timeout of the spin_until_future_complete, but it did not work like last time:

client_instance_->callback_executor.spin_some(std::chrono::nanoseconds(10000000));

Answer 4 · 2024-06-12T12:00:43.000Z

I need a way to reproduce this. Is there any dummy server and client that can be used to reproduce the error?

It seems to me like a potential error in the server. Also, which version of ROS are you using?

Answer 5 · 2024-06-12T20:26:27.000Z

I'm running ROS2 Humble on Ubuntu 22.04

I joined a package with a custom BT with a simple sequence where a condition waits for geometry_msgs::msg::Pose on a topic to be received then set it to the blackboard where a Nav2_action_client can take it and send an action goal to the Nav2 action server /navigate_to_pose.

dummy_behavior_tree.zip

All you will need is:

A workspace with a pre-setup robot running nav2. I tried it with Turtlebot3 running example of Nav2 and it works I can reproduce the bug https://docs.nav2.org/getting_started/index.html#running-the-example
Run the tree_executor node in a separate terminal with this launch cmd ros2 launch dummy_behavior_tree bt.launch.py you can use the verbose parameter to activate or deactivate the StdCoutLogger (verbose:=false)
Send in another terminal a goal pose of your choice on this topic /dummy_goal_pose with this cmd line ros2 topic pub /dummy_goal_pose geometry_msgs/msg/Pose "position: x: 0.0 y: 0.0 z: 0.0 orientation: x: 0.0 y: 0.0 z: 0.0 w: 1.0 " -1
Make sur the ETA of the navigation is greater than the server_timeout default to 10s to get the problem. If it's lesser than 10s, then you will see that the callback saying that the goal is accepted by the server triggers only once the robot reaches it's goal.

Here's the behavior tree :

I hope I didn't leave any bugs in the dummy_behavior_tree package. If I did, let me know, and I'll try to fix them. I can also spend some time helping you solve this bug if you don't have the time. However, I have limited knowledge of the behaviortree.ros2 package, so it might take me longer to find and properly resolve the issue.

Answer 6 · 2024-06-25T14:44:22.000Z

@facontidavide, were you able to reproduce the error? I also realized that downloading a zip from a GitHub comment might not be ideal. If you prefer, I can create a public repository with the code instead of the zip file.

Answer 7 · 2024-07-15T13:35:55.000Z

@facontidavide @AnthonyBreton the branch issue_76 doesn't seem to fix the bugs on my side either.