"Error: Could not find valid timed solution!" using .a library
jpaulos opened this issue · 4 comments
This library is wonderful.
I've written a Python interface directly to your .a library for use in a Python program (no Matlab or ROS runtime dependencies). It works robustly with b_sync_W=true, but if I set b_sync_W=false I can occasionally find edge cases where I get an "Error: Could not find valid timed solution!"
The Matlab reference implementation does solve these same test cases correctly, so the error may very well be on my end. In the test cases I fail, one axis solution matches the Matlab solution and the other axis is invalid. I have a few questions I hope would point me in the right direction:
-
I see the mex file (matlab/opt_control_lib_mex.mexa64) had a bug fix 15 months ago, but the library (ROS/opt_control/lib/opt_control_lib.a) is much older than that. Is there any chance the library I'm using doesn't have the most up to date bug fixes?
-
Other than the printed error message itself, do any of the function's return values give error codes to help narrow down where the problem lies?
-
Do you have any plans to release the underlying Matlab code? If I could compile the library myself, I suspect it would be easier to find the discrepancy between the Matlab and C++/Python solutions.
Hello jpaulos, thank you for commenting!
If you would like to share the Python interface, I can commit it so others can also benefit!
As you said, the software does not handle every corner case correctly. This is a known issue and specifically happens with nonzero initial and final velocities (also sometimes in the MATLAB version). I am working on it but can NOT guarantee a release date. Sorry!
I intent do release the source code soon, but I can NOT guarantee a date here as well.
If you provide the exact inputs that cause your error, I can have a brief look to confirm that they trigger the known issue...
I will be very happy to contribute the Python interface once it's robust; I think it's close.
I was able to find a corner case that the ROS demo fails but the Matlab demo passes. Since the provided compiled C++ *.a is 7 months older than the Matlab *.mex, I'm suspicious/hopeful the *.a just needs to be recompiled with the latest bugfixes.
Here is a 3-axis corner case:
start: [0.1, 0.2, 0.5] with zero velocity and acceleration
end: [0.0, 0.0, 0.0] with zero velocity and acceleration
velocity limits all -10/+10
acceleration limits all -5/+5
jerk limits all -100/+100
b_comp_global = false;
b_sync_V = true
b_sync_A = true
b_sync_J = false
b_sync_W = false
b_rotate = false
b_best_solution = true
b_hard_vel_limit = false
b_catch_up = true
To see that this PASSES in the Matlab code, you can swap these two code snips into the demo example.m:
In the switch statement:
case 9
num_axes = 3;
num_trajectories = 1;
State_start = [ 0.1 0.0 0.0;
0.2 0.0 0.0;
0.5 0.0 0.0];
Waypoints(:,:,1) = [ 0.0 0.0 0.0 0.0 0.0;
0.0 0.0 0.0 0.0 0.0;
0.0 0.0 0.0 0.0 0.0];
V_max = [ 10.0;
10.0;
10.0];
V_min = [-10.0;
-10.0;
-10.0];
A_max = [ 5;
5;
5];
A_min = [-5;
-5;
-5];
J_max = [ 100.0;
100.0;
100.0];
J_min = [-100.0;
-100.0;
-100.0];
A_global = [ 0.0;
0.0;
0.0];
And below that:
b_comp_global = false;
b_sync_V = true(num_axes,num_trajectories);
b_sync_A = true(num_axes,num_trajectories);
b_sync_J = false(num_axes,num_trajectories);
b_sync_W = false(num_axes,num_trajectories);
b_rotate = false(1,num_trajectories);
b_best_solution = true(num_axes,num_trajectories);
b_hard_vel_limit = false(num_axes,num_trajectories);
b_catch_up = true(num_axes,num_trajectories);
solution_in = -1 * ones(num_axes,2,num_trajectories,'int16');
ts_rollout = 0.01;
To see that this FAILS in the ROS code, you can use this .cfg file, hard code the initial conditions in main.cpp, and clean/compile/run:
The config file:
#!/usr/bin/env python
PACKAGE = "opt_control"
from dynamic_reconfigure.parameter_generator_catkin import *
gen = ParameterGenerator()
gen.add("V_max", double_t, 0, "Maximum Velocity (m/s)", 10.0)
gen.add("V_min", double_t, 0, "Minimum Velocity (m/s)", -10.0)
gen.add("A_max", double_t, 0, "Maximum Acceleration (m^2/s)", 5.0)
gen.add("A_min", double_t, 0, "Minimum Acceleration (m^2/s)", -5.0)
gen.add("J_max", double_t, 0, "Maximum Jerk (m^3/s)", 100.0)
gen.add("J_min", double_t, 0, "Minimum Jerk (m^3/s)", -100.0)
gen.add("b_sync_V", bool_t, 0, "sync trajectories with V?", True)
gen.add("b_sync_A", bool_t, 0, "sync trajectories with A?", True)
gen.add("b_sync_J", bool_t, 0, "sync trajectories with J?", False)
gen.add("b_sync_W", bool_t, 0, "sync trajectories with W?", False)
gen.add("b_rotate", bool_t, 0, "Rotate trajectory?", False)
gen.add("b_best_solution", bool_t, 0, "Find best solution?", True)
gen.add("b_hard_vel_limit", bool_t, 0, "Hard velocity limit?", False)
gen.add("b_catch_up", bool_t, 0, "Catch up trajectories?", True)
exit(gen.generate(PACKAGE, "opt_control", "OptControl"))
Then hard code initial condition at line 68 in main.cpp.
double P_init[num_axes] = {0.1, 0.2, 0.5};
double V_init[num_axes] = {0, 0, 0};
double A_init[num_axes] = {0, 0, 0};
Then build and launch.
$ catkin_make clean
$ catkin_make
$ roslaunch opt_control opt_control.launch
On launch, you will immediately get the message "Error: Could not find valid timed solution!"
If you now set b_sync_W to True it will immediately begin working.
$ rosrun dynamic_reconfigure dynparam set /opt_control b_sync_W true
Since the provided compiled C++ *.a is 7 months older than the Matlab *.mex, I'm suspicious/hopeful the *.a just needs to be recompiled with the latest bugfixes.
Hello jpaulos,
thank you for the detailed report!
I was able to reproduce the issue on my machine. Actually the error was not known to me until now.
The huge symmetric jerk limits cause some sort of numerical instability I never experienced before. If I change the limits to e.g. +100/-99.9998 or any other nonsymmetric value, the error does NOT occur (+100/-99.9999 again causes the error). Can you reproduce this behavior on your side?
If yes, please either use asymmetric jerk limits or reduce the total jerk limit until I have a fix.
Best,
Marius
Yes, I reproduce the same behavior on that particular example. I was also able to go back and find other corner cases which required a larger jerk limit modification to resolve (eg. +100/-97). In general I can also trigger the same error message with small and asymmetric jerks limits, but that may have a different root cause.
In case it's helpful, here may be a related corner case that is also affected by adjusting the jerk limits. The difference is that this case actually segfaults and crashes the program.
double P_init[num_axes] = {-0.9, 1.0, -0.3};
double V_init[num_axes] = { 0.1, 1.0, -0.7};
double A_init[num_axes] = { 0, 0, 0};
double P_wayp[num_axes] = { 1.0, 1.0, 1.0};
double V_wayp[num_axes] = { 1.0, -1.0, -1.0};
double A_wayp[num_axes] = { 0, 0, 0};
Otherwise the example parameters are the same as above. The program crashes with an error:
Index exceeds array dimensions. Index value 1 exceeds valid range [1-0] of array index_setp_max.
Error in evaluate_to_time (line 25)
Aborted (core dumped)
Once again this example passes in Matlab using the *.mex. I would expect the Matlab-compiled *.a and the *.mex to yield the same behavior, so I'm suspicious that the compiled library is out of date with the latest Matlab bugfixes.
Since these behaviors are the same in both ROS and Python, I'm growing fairly confident the Python interface to the compiled library is working correctly. If interested, you can find the Python fork here. This is not ready for inclusion in your library, but it may be useful for testing. It includes some scripts for randomized testing that I find a lot more convenient than setting up ROS test cases.
https://github.com/jpaulos/opt_control/blob/master/python/README.md