PX4/PX4-GPSDrivers

receive function never returns in some conditions for UBX drivers

AlexisTM opened this issue · 18 comments

I don't know if it is intended, but I have the following conditions:

  • ret > 0
  • _interface == Interface::UART
    => There is no return.
    Because of that, the configure function is never returning and I am stuck in an infinite loop.

https://github.com/PX4/GpsDrivers/blob/e84bb0a7a702320fedade6c83bbdf2324e3be8fb/src/ubx.cpp#L431-L485

Don't we have to add on line 476: return handled?

What receiver are you using? There is a timeout that avoids infinite loops.

I am using the M8P-2: HERE+ base station.
The platform is Linux but not using QTSerial.

I did not pinpointed the location of the loop, it seems to call receive() continuously, waiting for it to return something. Instead, it seems to always timeout.

Adding the return seems to make it work as I can now go further than the configure and receive GPS coordinates.

It might be misconfigured as well; I will try to have the RTCM messages this afternoon.

@bkueng I confirm that without this modification, my GPS device is never configured from Linux on my node;

I also confirm that I can receive the RTCM messages - so that with the modification, the configuration succeeds.

NOTE: The node I refer to is: https://github.com/AlexisTM/rtk_ros

The existing logic there is indeed a bit strange.
But I see that you ignore the timeout in https://github.com/AlexisTM/rtk_ros/blob/master/rtk_ros/include/rtk_ros/rtk_node.hpp#L203, which you should not.
Can you use the timeout correctly and retry?

As I am not using QTSerial, therefore the function does not exist.

The equivalent is void Serial::waitByteTimes | ( | size_t | count | ) which waits an amount of time equivalent to the time it would take to read count bytes.

How important is the timeout duration? I can also set a fixed timeout duration.

@bkueng
I just tried the following but without the return in configure; it never finishes configuring the GPS.

                if (serial->available() == 0) {
                    serial::Timeout timeout = serial->getTimeout();
                    int timeout_duration = *((int *) data1);
                    timeout.read_timeout_constant = timeout_duration;
                    serial->setTimeout(timeout);
                    if (!serial->waitReadable())
                        return 0; // error, no new data
                }
                return (int)serial->read((uint8_t *) data1, data2);

It actually continues forever saying: "I would have returned: 0".

It is stuck the second time the receive function is called.

The read function from the serial therefore times out properly.

image

To return on that very command; I added a count which returns only for the second call (count == 1).

The GPS properly configures itself as well.

			} else {
				if(count == 1)
					return handled;
				count ++;
				ROS_FATAL_STREAM("I would have returned: " << handled);
			}

image

I also show the value of ready_to_return.

image

It stays "false" forever.

NOTE:

bool ready_to_return = _configured ? (_got_posllh && _got_velned) : handled;

The issue only happens when we configure the GPS.
In that case:

  • _configured is false => ready_to_return = handled
  • During the first call, handled = 1.
  • During the second call, handled stays to 0.
  • Afterwards, _configured is true => The problem is gone.
  • ret is always > 0 (messages are incomming)
  • The messages are properly parsed by parseChar
  • The timeout using gps_absolute_time is not firing

Solution: Use a different time method which does not rely on the GPS for the timeout?

ret is always > 0 (messages are incomming)

This should not happen, as the GPS does not send a lot of data.

I see that your definition here is incorrect: https://github.com/AlexisTM/rtk_ros/blob/master/rtk_ros/include/rtk_ros/definitions.h#L20. This needs to be defined as a method that returns the current time in us. See here for example: https://github.com/mavlink/qgroundcontrol/blob/master/src/GPS/definitions.h#L76

@bkueng I am pretty sure this is my problem! Thanks!

The problem of "ret always > 0" comes from the fact that the library I use the waitReadable function which waits until there is something available.

http://docs.ros.org/kinetic/api/serial/html/classserial_1_1Serial.html

Ok so is everything working now?

I still have the issue but it is due to a different serial library to avoid the QT dependency. It works fine otherwise.