berndpfrommer/basalt_ros1

Crash when attempting to use with a realsense D435i

martinakos opened this issue · 12 comments

I have successfully used basalt_ros1 with a T265 camera, through the realsense2_camera node.

Now I'm trying to do the same with a realsense D435i. I manually wrote the calibration file for the D435i. The T_imu_cam and intrinsics are the default ones in the memory of the D435i and the IMU values are copied from the T265 the configuration file (I assume this is correct as both cameras have the same IMU). The vio_config_file I use is the same one as for the T265.
However, when I launch the vio_node (having launched the realsense2_camera node) it crashes as shown below:

$ roslaunch basalt_ros1 vio_d435i_imu.launch 
... logging to /home/martinix/.ros/log/6903c06e-b32d-11eb-95e2-8c04ba280f13/roslaunch-martinix-12598.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://martinix:42323/

SUMMARY
========

CLEAR PARAMETERS
 * /vio_node/

PARAMETERS
 * /rosdistro: melodic
 * /rosversion: 1.14.9
 * /vio_node/calibration_file: /home/martinix...
 * /vio_node/imu_topics: ['imu']
 * /vio_node/odom_frame_id: basalt_pose_frame
 * /vio_node/vio_config_file: /home/martinix...
 * /vio_node/world_frame_id: world

NODES
  /
    vio_node (basalt_ros1/vio_node)

ROS_MASTER_URI=http://localhost:11311

process[vio_node-1]: started with pid [12616]
[ INFO] [1620830354.353280763]: loaded calibration file with 2 cameras
marg_H
1e+08     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0 1e+08     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0 1e+08     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0 1e+08     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0    10     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0    10     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0    10     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0   100     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0   100     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0   100
[ INFO] [1620830354.357204623]: imu subscribing to topic: imu
[ INFO] [1620830354.359028801]: backend started!
Setting up filter: t_ns 3241660697544086456
T_w_i
  0.995311  -0.060004  0.0758686          0
 -0.060004   0.232185   0.970819          0
-0.0758686  -0.970819   0.227495          0
         0          0          0          1
vel_w_i 0 0 0
[ INFO] [1620830357.925914772]: received image frame rate: 28.0582
Sophus ensure failed in function 'static Sophus::SO3<Scalar_> Sophus::SO3<Scalar_, Options>::expAndTheta(const Tangent&, Sophus::SO3<Scalar_, Options>::Scalar*) [with Scalar_ = double; int Options = 0; Sophus::SO3<Scalar_, Options>::Tangent = Eigen::Matrix<double, 3, 1>; Sophus::SO3<Scalar_, Options>::Scalar = double]', file '/home/martinix/develop/thirdparty/basalt_ros/src/basalt_ros/basalt/thirdparty/basalt-headers/thirdparty/Sophus/sophus/so3.hpp', line 619.
SO3::exp failed! omega: nan nan nan, real: nan, img: nan
[vio_node-1] process has died [pid 12616, exit code -6, cmd /home/martinix/develop/thirdparty/basalt_ros/devel/lib/basalt_ros1/vio_node ~left_image:=/camera/infra1/image_rect_raw ~right_image:=/camera/infra2/image_rect_raw ~imu:=/camera/imu __name:=vio_node __log:=/home/martinix/.ros/log/6903c06e-b32d-11eb-95e2-8c04ba280f13/vio_node-1.log].
log file: /home/martinix/.ros/log/6903c06e-b32d-11eb-95e2-8c04ba280f13/vio_node-1*.log
all processes on machine have died, roslaunch will exit
shutting down processing monitor...
... shutting down processing monitor complete
done

Any idea what the problem could be?

I really don't know. How is the IMU data looking? Is that coming in? The T265 publishes accel and gyro separately (depending on the driver config I believe). How is that for the D435i?

In the realsense2_camera launch file I have:

  <arg name="gyro_fps"            default="200"/>
  <arg name="accel_fps"           default="63"/>
  <arg name="enable_gyro"         default="true"/>
  <arg name="enable_accel"        default="true"/>
  <arg name="unite_imu_method"          default="linear_interpolation"/>

So this produces imu messages with interpolated accel. The imu messages are being produced at 200Hz and the contents look fine. This is the same setup I use in my T265 realsense2_camera launch file. In the launch file for the vio_node I specify to use the "imu" topic instead of "gyro" and "accel".

The D435i calibration file for the vio_node I use is:

{
    "value0": {
        "T_imu_cam": [
            {
                "px": 0.006,
                "py": -0.005,
                "pz": -0.012,
                "qx": 0.0,
                "qy": 0.0,
                "qz": 0.0,
                "qw": 1.0
            },
            {
                "px": -0.044, 
                "py": -0.005,
                "pz": -0.012,
                "qx": 0.0,
                "qy": 0.0,
                "qz": 0.0,
                "qw": 1.0
            }
        ],
        "intrinsics": [
            {
                "camera_type": "kb4",
                "intrinsics": {
                    "fx": 384.77935791015625,
                    "fy": 384.77935791015625,
                    "cx": 317.7252502441406,
                    "cy": 239.5641326904297,
                    "k1": 0.0,
                    "k2": 0.0,
                    "k3": 0.0,
                    "k4": 0.0
                }
            },
            {
                "camera_type": "kb4",
                "intrinsics": {
                    "fx": 384.77935791015625,
                    "fy": 384.77935791015625,
                    "cx": 317.7252502441406,
                    "cy": 239.5641326904297,
                    "k1": 0.0,
                    "k2": 0.0,
                    "k3": 0.0,
                    "k4": 0.0
                }
            }
        ],
        "resolution": [
            [
                640,
                480
            ],
            [
                640,
                480
            ]
        ],
        "calib_accel_bias": [
            -0.1240689605474472,
            -0.015732543542981149,
            -0.02691028267145157,
            -0.004307746887207031,
            0.0,
            0.0,
            0.026349782943725587,
            0.0,
            0.007716536521911621
        ],
        "calib_gyro_bias": [
            0.0015270404983311892,
            -0.0033326493576169016,
            0.00015611646813340485,
            -0.005540788173675537,
            0.0,
            0.0,
            0.0,
            -0.0004967451095581055,
            0.0,
            0.0,
            0.0,
            0.0002480745315551758
        ],
        "imu_update_rate": 200.0,
        "accel_noise_std": [
            0.008182447828798795,
            0.008182447828798795,
            0.008182447828798795
        ],
        "gyro_noise_std": [
            0.002268927090244338,
            0.002268927090244338,
            0.002268927090244338
        ],
        "accel_bias_std": [
            0.009999999873689375,
            0.009999999873689375,
            0.009999999873689375
        ],
        "gyro_bias_std": [
            0.0007071067802939111,
            0.0007071067802939111,
            0.0007071067802939111
        ],
        "cam_time_offset_ns": 0,
        "vignette": []
    }
}

I wonder if this calibration file is the culprit.

The D435i is not meant for VIO and has (at least) the following drawbacks over the T265:

  • too narrow field of view
  • rolling shutter!
    I suspect the camera and IMU output are also not synchronized.

One thing I'm fairly certain is that your distortion model is wrong. You give "kb4", which is the same as "equidistant" (Kalibr) and "fisheye" (opencv). If you think setting the distortion coeffs to zero is going to "switch off" the distortion, you are mistaken. This only works for "radtan" (plumb_bob), which defaults to pinhole if coeffs are zero. The "kb4" however specifies a fisheye projection function, which is different from the pinhole projection, i.e. you will have "distortion" on the corners even if the coeffs are all zero.
What does your image look like? Is it undistorted already? Look out for straight lines near the edge of the image. Are they straight? Look at the corresponding ROS camerainfo messages. Any distortion coefficients there?

I have only used Basalt with fisheye based cameras thus far, but it seems like there is a "pinhole" camera supported (without distortion). You may have to run an undistort node to convert the image if the driver doesn't already publish the rectified images. If the image is rectified, make sure you get the corresponding calibration, since typically the rectification not only sets the distortion to zero, but also messes with image center and focal length.

Here is where the code for the pinhole camera is. Again, I have not used it yet.

./thirdparty/basalt-headers/include/basalt/camera/pinhole_camera.hpp

This document indicates that the D400 series of cameras has "Brown" distortion model (looks like OpenCV plumb_bob aka radtan).

https://www.intel.com/content/dam/support/us/en/documents/emerging-technologies/intel-realsense-technology/RealSense_D400%20_Custom_Calib_Paper.pdf

After all this: I'm not sure that's the problem giving you the SO3 bomb-out right away. You may have to go back into the basalt code and piece-by-piece print out where the nans come from.

Thanks for the reply. I have changed the camera_type to pinhole. So the intrinsics section looks like this now:

        "intrinsics": [
            {
                "camera_type": "pinhole",
                "intrinsics": {
                    "fx": 384.77935791015625,
                    "fy": 384.77935791015625,
                    "cx": 317.7252502441406,
                    "cy": 239.5641326904297
                }
            },
            {
                "camera_type": "pinhole",
                "intrinsics": {
                    "fx": 384.77935791015625,
                    "fy": 384.77935791015625,
                    "cx": 317.7252502441406,
                    "cy": 239.5641326904297
                }
            }
        ],

However, I still get the same crash. I wonder if I could disable the imu use to determine if the problem comes from the visual part or the inertial part. I know in basalt there is a --use-imu 0 switch I could use, but I don't know if basalt_ros uses it. ?

The camera_info for the D435i shows no distortion parameters, see:

rostopic echo /camera/infra2/camera_info
---
header: 
  seq: 260
  stamp: 
    secs: 1620896022
    nsecs: 558945179
  frame_id: "camera_infra2_optical_frame"
height: 480
width: 640
distortion_model: "plumb_bob"
D: [0.0, 0.0, 0.0, 0.0, 0.0]
K: [384.77935791015625, 0.0, 317.7252502441406, 0.0, 384.77935791015625, 239.5641326904297, 0.0, 0.0, 1.0]
R: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
P: [384.77935791015625, 0.0, 317.7252502441406, 0.0, 0.0, 384.77935791015625, 239.5641326904297, 0.0, 0.0, 0.0, 1.0, 0.0]
binning_x: 0
binning_y: 0
roi: 
  x_offset: 0
  y_offset: 0
  height: 0
  width: 0
  do_rectify: False
---

The images from this camera have little distortion (although not null) so realsense sets the default distortion parameters to zero. I've used this camera with zero distortion parameters in orbslam3 and it works well enough.

Well, that's somewhat weird that the image is distorted, but the distortion coefficients are zero.

I have never tried the --use-imu 0 switch in Basalt and I was under the impression that it would not work without it, but I'd be happy to be surprised. I definitely don't remember implementing ROS support for non-imu operation.

The only way I see here to make progress is to trace backwards from the SO3 error via printf debugging. Would be particularly interesting to know if this error occurs the first time that section of the code is reached, which would indicate some setup/config error.

I made some progress on this. I calibrated the D435i cameras with Kalibr and I used a kb4 camera model. And the T_imu_cam I used initially was wrong, which alone could explain the crashes. Finally I got basalt_ros to work with a D435i with a live camera and rosbags.

However, sometimes I still get crashes. When it crashes is in the vio_back_end, as I can run the rosbags that crash through vio_front_end and visualize the optical flow without problems.

For example, I get a crash in the Sophus library. The reason is that there is nans in the SO3 object values. Tracking these nans down I see they originate in the optimize. The increment to apply to the pose contains nans, and that comes from the Hdiag matrix which has nans on it. See snapshot of the qtcreator debugger at the moment it crashes:
image

I'm not sure how these nans get in there but I noticed that accel and gyro biases at the start of KeypointVioEstimator::measure had huge values. I printed these biases when the meas pointer is reset with a IntegratedImuMeasurement object, at line 224 keypoint_vio.cpp and you can see below how the biases quickly increase until the crash happens.

SUMMARY
========

CLEAR PARAMETERS
 * /vio_node/

PARAMETERS
 * /rosdistro: melodic
 * /rosversion: 1.14.9
 * /vio_node/calibration_file: /home/martinix...
 * /vio_node/imu_topics: ['gyro', 'accel']
 * /vio_node/odom_frame_id: basalt_pose_frame
 * /vio_node/vio_config_file: /home/martinix...
 * /vio_node/world_frame_id: world

NODES
  /
    vio_node (basalt_ros1/vio_node)

ROS_MASTER_URI=http://localhost:11311

process[vio_node-1]: started with pid [12151]
[ INFO] [1621861036.672878919]: loaded calibration file with 2 cameras
marg_H
1e+08     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0 1e+08     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0 1e+08     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0 1e+08     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0     0    10     0     0     0     0     0
    0     0     0     0     0     0     0     0     0     0    10     0     0     0     0
    0     0     0     0     0     0     0     0     0     0     0    10     0     0     0
    0     0     0     0     0     0     0     0     0     0     0     0   100     0     0
    0     0     0     0     0     0     0     0     0     0     0     0     0   100     0
    0     0     0     0     0     0     0     0     0     0     0     0     0     0   100
[ INFO] [1621861036.697331675]: imu subscribing to topics: gyro, accel
[ INFO] [1621861036.707400492]: backend started!
Setting up filter: t_ns 1621522750069700352
T_w_i
  0.996581  0.0618293 -0.0548094          0
 0.0618293  -0.118021   0.991084          0
 0.0548094  -0.991084   -0.12144          0
         0          0          0          1
vel_w_i 0 0 0
last_state.getState().bias_accel:0 0 0 last_state.getState().bias_gyro:0 0 0
last_state.getState().bias_accel:0 0 0 last_state.getState().bias_gyro:0 0 0
last_state.getState().bias_accel:0 0 0 last_state.getState().bias_gyro:0 0 0
WARNING: DROPPING OLD FRAME, imu queue size: 0
last_state.getState().bias_accel:0 0 0 last_state.getState().bias_gyro:0 0 0
WARNING: DROPPING OLD FRAME, imu queue size: 0
last_state.getState().bias_accel:-2.28121e-27  1.05844e-26 -2.50928e-26 last_state.getState().bias_gyro:-1.44813e-26  8.76797e-27 -2.16878e-26
last_state.getState().bias_accel: 1.43758e-16 -3.91514e-16  1.39284e-15 last_state.getState().bias_gyro:-3.22629e-15 -1.66364e-16   1.0896e-15
WARNING: DROPPING OLD FRAME, imu queue size: 0
last_state.getState().bias_accel: 1.43758e-16 -3.91514e-16  1.39284e-15 last_state.getState().bias_gyro:-3.22629e-15 -1.66364e-16   1.0896e-15
last_state.getState().bias_accel: 56.8358 -36.3776  18.1185 last_state.getState().bias_gyro: -4.4435e+06 -2.51866e+06  2.18104e+07
last_state.getState().bias_accel:-36.7104  -44.033   243.47 last_state.getState().bias_gyro:-4.44438e+06 -2.51843e+06  2.18078e+07
last_state.getState().bias_accel: 56.6084 -350.378  619.986 last_state.getState().bias_gyro:  -4.445e+06 -2.51858e+06  2.18076e+07
last_state.getState().bias_accel: -38.708 -1069.28  281.001 last_state.getState().bias_gyro:-4.44506e+06  -2.5187e+06  2.18076e+07
last_state.getState().bias_accel: -38.737 -1070.81  279.656 last_state.getState().bias_gyro:-4.44506e+06  -2.5187e+06  2.18076e+07
last_state.getState().bias_accel:-38.9256 -1070.83  279.212 last_state.getState().bias_gyro:-4.44506e+06  -2.5187e+06  2.18076e+07
last_state.getState().bias_accel:-39.7185    -1071  277.209 last_state.getState().bias_gyro:-4.44506e+06  -2.5187e+06  2.18076e+07
last_state.getState().bias_accel:-69.5977 -1095.38   270.99 last_state.getState().bias_gyro:-4.44503e+06 -2.51866e+06  2.18076e+07
last_state.getState().bias_accel: 9.09569e+08 -6.18129e+09  6.78871e+08 last_state.getState().bias_gyro:-4.44531e+06 -2.51924e+06  2.18077e+07
last_state.getState().bias_accel: 2.98719e+09 -2.08118e+10  2.30671e+09 last_state.getState().bias_gyro:-4.46631e+06  -2.5563e+06  2.18119e+07
last_state.getState().bias_accel:6.24951e+09  -2.334e+10 2.86749e+09 last_state.getState().bias_gyro:-4.47705e+06 -2.57506e+06  2.18141e+07
last_state.getState().bias_accel: 1.31841e+10 -2.03437e+10  3.23629e+09 last_state.getState().bias_gyro:-4.47441e+06 -2.57047e+06  2.18136e+07
last_state.getState().bias_accel:  1.4462e+10 -1.97796e+10  3.30306e+09 last_state.getState().bias_gyro:-4.47441e+06 -2.57047e+06  2.18136e+07
last_state.getState().bias_accel:1.46174e+10 -1.9711e+10 3.31118e+09 last_state.getState().bias_gyro:-4.27277e+06 -2.22174e+06   2.1772e+07
Sophus ensure failed in function 'static Sophus::SO3<Scalar_> Sophus::SO3<Scalar_, Options>::expAndTheta(const Tangent&, Sophus::SO3<Scalar_, Options>::Scalar*) [with Scalar_ = double; int Options = 0; Sophus::SO3<Scalar_, Options>::Tangent = Eigen::Matrix<double, 3, 1>; Sophus::SO3<Scalar_, Options>::Scalar = double]', file '/home/martinix/develop/thirdparty/basalt_ros/src/basalt_ros/basalt/thirdparty/basalt-headers/thirdparty/Sophus/sophus/so3.hpp', line 619.
SO3::exp failed! omega: nan nan nan, real: nan, img: nan

Any ideas? I wonder if the problem is related with the warnings I get: "WARNING: DROPPING OLD FRAME, imu queue size: 0"?

At this point :
(your expertise in Basalt) > mine.

I have however seen before that nans show up in the optimizer because values grow out of bounds.

About the DROPPING_OLD_FRAME warnings: from my reading of the code this should never happen if the data is good. I believe you get these messages when your IMU time stamps are going backwards. Who knows what else is bad with the data. Proper time stamps are really important. If you have a spike in the acceleration and you integrate it up, and your delta_t is off it will really mess up the system because s = 1/2 a t^2. Can you check that the IMU time stamps are indeed messed up? Print out the time stamp when it gives you the "DROPPING OLD FRAME" messages, then look at the data around that time to see what's going on.

I've verified the timestamps for the messages and for the message headers for the rosbag that results in a crash and they are strictly increasing.
However, I think I know what's the problem. The rosbag I'm trying to get to work uses the "accel" and "gyro" topics but the accel topic is already linearly interpolated to the gyro topic. I see in the basalt_ros code that if the topics are "accel" and "gyro", the accel is interpolated to the gyro, so I end up having accel interpolated twice! Note that the rosbag wasn't recorded from the realsense ros driver directly, it was converted from a different type of dataset, so I don't have the option of producing a combined "imu" message as I would with the realsense ros driver.

I've verified that the problem was that the accel topic was already linearly interpolated to the gyro topic in my rosbag. I created a node that combines both accel and gyro messages from my rosbag into a imu message which I then use with basalt_ros and it works fine!

Actually, the problem is not gone yet. It seems it can work for some time and then it crashes with the nans in the SO3 problem. Curiously each time I run it it works a different amount of time until in crashes, and this is with the same rosbag. If I enable the debug messages I see that there is another of these "WARNING: DROPPING OLD FRAME" just before the optimizer make a huge increment and then it crashes. See below the last 3 frames before crashing:

states_to_remove 1
poses_to_marg.size() 0
states_to_marg.size() 1
state_to_marg_vel_bias.size() 0
kfs_to_marg.size() 0
keeping 57 marg 15 total 72
last_state_to_marg 1621522662317147392 frame_poses 7 frame_states 3
marginalizaon done!!
======== Marg nullspace ==========
nullspace x_trans: 66.5874 + -0.319601
nullspace y_trans: 72.2058 + 14.0512
nullspace z_trans: 125.317 + 1.11396
nullspace roll   : 49212.3 + 80.966
nullspace pitch  : 48678.5 + -1433.89
nullspace yaw    : 12434.6 + -455.768
nullspace random : 5.78513e+06 + -12356.5
=================================
connected0 110 unconnected0 36
=================================
[LINEARIZE] Error: 671.369 num points 
iter 0 before_update_error: vision: 2244.88 imu: 0.0495511 bg_error: 0 ba_error: 0 marg_prior: -1573.56 total: 671.369
iter 0  after_update_error: vision: 2214.74 imu: 0.676697 bg_error: 1.09057e-05 ba_error: 8.74987e-09 marg prior: -1577.25 total: 638.168 error_diff 33.2014 time : 3981(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 638.168 num points 
iter 1 before_update_error: vision: 2214.74 imu: 0.676697 bg_error: 1.09057e-05 ba_error: 8.74987e-09 marg_prior: -1577.25 total: 638.168
iter 1  after_update_error: vision: 2213.63 imu: 0.702428 bg_error: 1.16219e-05 ba_error: 8.53818e-09 marg prior: -1577.19 total: 637.139 error_diff 1.02896 time : 3513(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 637.139 num points 
iter 2 before_update_error: vision: 2213.63 imu: 0.702428 bg_error: 1.16219e-05 ba_error: 8.53818e-09 marg_prior: -1577.19 total: 637.139
iter 2  after_update_error: vision: 2213.56 imu: 0.708754 bg_error: 1.1817e-05 ba_error: 8.53893e-09 marg prior: -1577.23 total: 637.037 error_diff 0.101932 time : 3820(us),  num_states 3 num_poses 7
=================================
states_to_remove 1
poses_to_marg.size() 0
states_to_marg.size() 1
state_to_marg_vel_bias.size() 0
kfs_to_marg.size() 0
keeping 57 marg 15 total 72
last_state_to_marg 1621522662350443520 frame_poses 7 frame_states 3
marginalizaon done!!
======== Marg nullspace ==========
nullspace x_trans: 66.5869 + -0.328839
nullspace y_trans: 72.1907 + 14.3543
nullspace z_trans: 125.296 + 1.04042
nullspace roll   : 49328.1 + 83.9543
nullspace pitch  : 49051.7 + -1439.05
nullspace yaw    : 12380 + -462.725
nullspace random : 7.14856e+06 + -33906.4
=================================
connected0 102 unconnected0 40
=================================
[LINEARIZE] Error: 17973.5 num points 
iter 0 before_update_error: vision: 19551.7 imu: 0.342028 bg_error: 0 ba_error: 0 marg_prior: -1578.57 total: 17973.5
iter 0  after_update_error: vision: 7001.53 imu: 687.982 bg_error: 0.0181707 ba_error: 0.000191547 marg prior: -1546.76 total: 6142.77 error_diff 11830.7 time : 3942(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 6142.77 num points 
iter 1 before_update_error: vision: 7001.53 imu: 687.982 bg_error: 0.0181707 ba_error: 0.000191547 marg_prior: -1546.76 total: 6142.77
iter 1  after_update_error: vision: 5018.72 imu: 1101.43 bg_error: 0.033811 ba_error: 0.000173979 marg prior: -1543.66 total: 4576.52 error_diff 1566.25 time : 3737(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 4576.52 num points 
iter 2 before_update_error: vision: 5018.72 imu: 1101.43 bg_error: 0.033811 ba_error: 0.000173979 marg_prior: -1543.66 total: 4576.52
iter 2  after_update_error: vision: 4431.4 imu: 1296.03 bg_error: 0.0413213 ba_error: 0.000160778 marg prior: -1545.73 total: 4181.75 error_diff 394.775 time : 3769(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 4181.75 num points 
iter 3 before_update_error: vision: 4431.4 imu: 1296.03 bg_error: 0.0413213 ba_error: 0.000160778 marg_prior: -1545.73 total: 4181.75
iter 3  after_update_error: vision: 4208.82 imu: 1388.02 bg_error: 0.044937 ba_error: 0.000152254 marg prior: -1546.43 total: 4050.46 error_diff 131.287 time : 3234(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 4050.46 num points 
iter 4 before_update_error: vision: 4208.82 imu: 1388.02 bg_error: 0.044937 ba_error: 0.000152254 marg_prior: -1546.43 total: 4050.46
iter 4  after_update_error: vision: 4114.06 imu: 1431.25 bg_error: 0.0466394 ba_error: 0.000148119 marg prior: -1546.78 total: 3998.58 error_diff 51.8817 time : 3663(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3651.8 num points 
iter 5 before_update_error: vision: 3767.28 imu: 1431.25 bg_error: 0.0466394 ba_error: 0.000148119 marg_prior: -1546.78 total: 3651.8
iter 5  after_update_error: vision: 3846.77 imu: 1374.2 bg_error: 0.0442651 ba_error: 0.000156621 marg prior: -1543.13 total: 3677.88 error_diff -26.0854 time : 6343(us),  num_states 3 num_poses 7
increased error after update!!!
[LINEARIZE] Error: 3677.88 num points 
iter 6 before_update_error: vision: 3846.77 imu: 1374.2 bg_error: 0.0442651 ba_error: 0.000156621 marg_prior: -1543.13 total: 3677.88
iter 6  after_update_error: vision: 3875 imu: 1359.06 bg_error: 0.0436608 ba_error: 0.000158099 marg prior: -1542.76 total: 3691.33 error_diff -13.4517 time : 3939(us),  num_states 3 num_poses 7
increased error after update!!!
=================================
states_to_remove 1
poses_to_marg.size() 0
states_to_marg.size() 1
state_to_marg_vel_bias.size() 0
kfs_to_marg.size() 0
keeping 57 marg 15 total 72
last_state_to_marg 1621522662383739392 frame_poses 7 frame_states 3
marginalizaon done!!
======== Marg nullspace ==========
nullspace x_trans: 66.5863 + -0.110157
nullspace y_trans: 72.1757 + 15.8992
nullspace z_trans: 125.275 + 3.5193
nullspace roll   : 49037.5 + 451.898
nullspace pitch  : 49067.1 + -858.267
nullspace yaw    : 12294.4 + -506.231
nullspace random : 4.08397e+06 + 42900.3
=================================
connected0 102 unconnected0 38
=================================
[LINEARIZE] Error: 5559.64 num points 
iter 0 before_update_error: vision: 6011.67 imu: 1315.99 bg_error: 0 ba_error: 0 marg_prior: -1768.02 total: 5559.64
iter 0  after_update_error: vision: 4473.54 imu: 1565.55 bg_error: 0.000383298 ba_error: 3.52696e-06 marg prior: -1742.23 total: 4296.85 error_diff 1262.79 time : 4512(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 4296.85 num points 
iter 1 before_update_error: vision: 4473.54 imu: 1565.55 bg_error: 0.000383298 ba_error: 3.52696e-06 marg_prior: -1742.23 total: 4296.85
iter 1  after_update_error: vision: 4012.32 imu: 1717.02 bg_error: 0.000379169 ba_error: 5.54399e-06 marg prior: -1743.37 total: 3985.97 error_diff 310.888 time : 3717(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3985.97 num points 
iter 2 before_update_error: vision: 4012.32 imu: 1717.02 bg_error: 0.000379169 ba_error: 5.54399e-06 marg_prior: -1743.37 total: 3985.97
iter 2  after_update_error: vision: 3851.65 imu: 1781.19 bg_error: 0.00036005 ba_error: 6.50404e-06 marg prior: -1742.77 total: 3890.07 error_diff 95.8963 time : 4405(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3890.07 num points 
iter 3 before_update_error: vision: 3851.65 imu: 1781.19 bg_error: 0.00036005 ba_error: 6.50404e-06 marg_prior: -1742.77 total: 3890.07
iter 3  after_update_error: vision: 3793.58 imu: 1806.65 bg_error: 0.000356834 ba_error: 6.91218e-06 marg prior: -1742.3 total: 3857.94 error_diff 32.1297 time : 3916(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3857.94 num points 
iter 4 before_update_error: vision: 3793.58 imu: 1806.65 bg_error: 0.000356834 ba_error: 6.91218e-06 marg_prior: -1742.3 total: 3857.94
iter 4  after_update_error: vision: 3770.96 imu: 1817.15 bg_error: 0.000357014 ba_error: 7.08222e-06 marg prior: -1742.12 total: 3845.99 error_diff 11.9497 time : 3596(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3845.99 num points 
iter 5 before_update_error: vision: 3770.96 imu: 1817.15 bg_error: 0.000357014 ba_error: 7.08222e-06 marg_prior: -1742.12 total: 3845.99
iter 5  after_update_error: vision: 3761.57 imu: 1821.64 bg_error: 0.000357452 ba_error: 7.15491e-06 marg prior: -1742.06 total: 3841.15 error_diff 4.83931 time : 3608(us),  num_states 3 num_poses 7
[LINEARIZE] Error: 3841.15 num points 
iter 6 before_update_error: vision: 3761.57 imu: 1821.64 bg_error: 0.000357452 ba_error: 7.15491e-06 marg_prior: -1742.06 total: 3841.15
iter 6  after_update_error: vision: 3757.42 imu: 1823.65 bg_error: 0.000357829 ba_error: 7.18627e-06 marg prior: -1742.03 total: 3839.04 error_diff 2.11243 time : 8803(us),  num_states 3 num_poses 7
=================================
states_to_remove 1
poses_to_marg.size() 0
states_to_marg.size() 1
state_to_marg_vel_bias.size() 0
kfs_to_marg.size() 0
keeping 57 marg 15 total 72
last_state_to_marg 1621522662483639808 frame_poses 7 frame_states 3
marginalizaon done!!
======== Marg nullspace ==========
nullspace x_trans: 66.5848 + -0.220446
nullspace y_trans: 72.1317 + 15.1177
nullspace z_trans: 125.214 + 7.22388
nullspace roll   : 48420.9 + 588.215
nullspace pitch  : 48676.2 + -464.695
nullspace yaw    : 12179 + -488.211
nullspace random : 5.16243e+06 + 18312.9
=================================
WARNING: DROPPING OLD FRAME, imu queue size: 3
connected0 96 unconnected0 47
=================================
[LINEARIZE] Error: 32912.4 num points 
iter 0 before_update_error: vision: 34564.1 imu: 7.71309 bg_error: 0 ba_error: 0 marg_prior: -1659.48 total: 32912.4
iter 0  after_update_error: vision: 7.27074e+08 imu: 2.75184e+187 bg_error: 2.59672e+82 ba_error: 2.1819e+88 marg prior: 5.79041e+91 total: 2.75184e+187 error_diff -2.75184e+187 time : 3593(us),  num_states 3 num_poses 7
increased error after update!!!
[LINEARIZE] Error: 2.75184e+187 num points 
Sophus ensure failed in function 'static Sophus::SO3<Scalar_> Sophus::SO3<Scalar_, Options>::expAndTheta(const Tangent&, Sophus::SO3<Scalar_, Options>::Scalar*) [with Scalar_ = double; int Options = 0; Sophus::SO3<Scalar_, Options>::Tangent = Eigen::Matrix<double, 3, 1>; Sophus::SO3<Scalar_, Options>::Scalar = double]', file '/home/martinix/develop/thirdparty/basalt_ros/src/basalt_ros/basalt/thirdparty/basalt-headers/thirdparty/Sophus/sophus/so3.hpp', line 619.
SO3::exp failed! omega: nan nan nan, real: nan, img: nan

One thing I notice is that the latest version of basalt doesn't have that "WARNING: DROPPING OLD FRAME". I see the version of basalt you use is from last Septembre. I'm going to try to update it to the latest and see if I still get the crash.

I find out the problem! and now balsat_ros seems to process the whole rosbag properly. The problem was in the node I made that combines both accel and gyro messages from my rosbag into a imu message, it was missing messages from time to time. I solved that and I could track the rosbag without crash. So we could close this issue.

On the other hand, I tried to merge the latest basalt on the basalt version you use. However when I build I would get some errors and didn't continue in that direction. It would be nice to have basalt_ros wrapping the latest basalt.

@martinakos thank you for doing some hard debugging for the D435i, I'm trying to use the D435i too, in the end how did you integrate the topics? Did you write a node to combine the data? But I would assume that you changed the <arg name="unite_imu_method" default=""/> to linear-interpolation and get an camera/imu topic for an easier fix

<rosparam param="imu_topics">["gyro", "accel"]</rosparam>
    <remap from="~gyro" to="/camera/gyro/sample" />
    <remap from="~accel" to="/camera/accel/sample" />

And also for the calibration on Kalibr, wouldn't pinhole be the original model for the camera and not kb4? Did you see any performance loss with the kb4