laxnpander/OpenREALM

Enabling fallback mode changes the results of tracked imagery

zthorson opened this issue · 8 comments

Summary

When running the same dataset with fallback enabled and disabled, the final results can be very different, even if the test set only pushes the very first frame (frame zero) through the fallback condition. This appears to modify the resulting projection of all the following VSLAM tracking.

Details

The issue seems to be amplified when the camera parameters (focal length in particular) are incorrect. There is still an effect with correct parameters, but it appears to be smaller.

To create a good sample set, I dropped the fx and fy camera parameters by a few hundred pixels. This makes the issue more apparent, as seen in the samples below. Both samples are run on the same dataset with the same settings, with only the fallback_mode changed.

result_canvas_500_fb0
result_canvas_500_fb1

Something about the frames being pushed out of the pose_estimator during initialization appears to be messing up the georeferencing in the following calls.

It looks like it comes down to SurfaceGeneration::createPlanarSurface. This call in the surface generation node locks in on the first received frame.

DigitalSurfaceModel::Ptr SurfaceGeneration::createPlanarSurface(const Frame::Ptr &frame)
{
  if (!m_is_projection_plane_offset_computed)
  {
    m_projection_plane_offset = computeProjectionPlaneOffset(frame);
    m_is_projection_plane_offset_computed = true;
  }
  // ...

Is there a reason this isn't updated on a frame-by-frame basis? I know it might result in a terrain map that is not planar overall, but it would be planar in the regions where individual images are projected.

Alternatively, I could keep it locking on the first frame, but allow it to re-lock the first time the system transitions from inaccurate to accurate poses.

Thoughts?

@zthorson: Hmmm, it's been a while since I implemented this. I think, as you said, the problem is that changing what counts as "ground" on a frame-to-frame basis might result in worse relative alignment. I am pretty sure I tried that first and went with this approach for a reason. Did you check the results for the frame-to-frame approach?

Some notes on the basic concept: Originally, I thought I needed to take care of the scenario where the relative altitude is "off". For example, your mission is to fly 100 m over an area, but it starts at an elevation of 0 m and is operated over terrain at an elevation of 500 m. Consequently, your flight is at 600 m relative altitude compared to your takeoff point. Projection is useless in this scenario, as I can only assume the ground is 600 m away. The idea was therefore to use the visual information (sparse cloud) to identify how far away the ground actually is at mission start, correct the relative altitude offset once, and keep it like that for the sake of relative integrity.

Your observation might be related to the fact that a bad camera calibration results in bad 3D points (sparse cloud), which in turn results in a bad estimate of "ground" and therefore a bad projection. I am not sure whether frame-to-frame updates or re-locking can do anything about that?
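To illustrate the idea (just a sketch, not the actual implementation): treat "ground" as a robust statistic of the sparse-cloud point heights, and fall back to 0 when no sparse data exists yet. The median choice and all names here are illustrative assumptions.

#include <algorithm>
#include <vector>

// Illustrative only: approximate "ground" as the median of the sparse-cloud
// point heights. With no sparse data available yet, fall back to an offset
// of 0.0, matching the fallback behavior discussed below.
double computeProjectionPlaneOffsetSketch(std::vector<double> point_heights)
{
  if (point_heights.empty())
    return 0.0;  // no sparse cloud yet -> no correction possible

  std::nth_element(point_heights.begin(),
                   point_heights.begin() + point_heights.size() / 2,
                   point_heights.end());
  return point_heights[point_heights.size() / 2];
}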

@laxnpander I had the same or slightly better performance with frame-by-frame adjustments on my 7 sample flight regression tests. While none of them have severe terrain changes, some do have slight hills that seem to work better with frame-by-frame adjustment. The elevation maps in the PNGs can vary quite a bit, but overall they seem to track the expected terrain (considering that it is an average across a wide area).

I think the main issue I am seeing with the bad camera calibration is that, since the first image in fallback mode is guaranteed to have an inaccurate pose (VSLAM takes a few frames to initialize), it always falls back to setting the planar offset to 0 (in SurfaceGeneration::computeProjectionPlaneOffset). So once VSLAM comes back with a better estimate of the offset using keypoints, that estimate doesn't get used.

Since I appear to be getting more reliable results and more robust error recovery with frame-to-frame adjustment, would you be opposed to me adding a switch in the surface_generation node to allow frame-by-frame elevation? Maybe a parameter called planar_frame_by_frame, set to 0 by default for the current behavior and 1 to allow for testing?

Here are some examples where both runs fall back, one with frame-by-frame adjustment and one with global adjustment. The camera parameters should be correct, or very close to correct, for both runs.

The larger fields like "slight hill" and CattleGrass show improvements at the edges of the field where the elevation appears to differ. The subfolders for each dataset contain both the final GeoTIFF and the elevation maps that were generated.

The config folder in the root shows the exact settings and yaml files used for OpenREALM as well.

Original (No Frame-by-Frame) Fallback Enabled
Updated (Every Frame) Fallback Enabled

@zthorson Hmmm, I will have a more detailed look at it next week. You are absolutely right that the first frame passing through shouldn't trigger the offset computation, as it can only fail due to the missing sparse cloud. But I also never intended it to do so; interesting that I didn't notice it back then. The m_is_projection_plane_offset_computed flag should only be set to true if the computation was actually performed, so it should be moved inside the if clause here:

if (frame->isDepthComputed())
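Roughly, the move could look like this (just a sketch reusing the names from the snippet above, not the exact change in the codebase):

DigitalSurfaceModel::Ptr SurfaceGeneration::createPlanarSurface(const Frame::Ptr &frame)
{
  if (!m_is_projection_plane_offset_computed && frame->isDepthComputed())
  {
    // Only latch the flag once sparse data exists, so an early fallback
    // frame (offset 0) no longer freezes the value for the rest of the run.
    m_projection_plane_offset = computeProjectionPlaneOffset(frame);
    m_is_projection_plane_offset_computed = true;
  }
  // ... (projection onto the plane continues as before)
}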

Will check the rest of your comment soon, thanks for the suggestions!

@laxnpander The only downside I see to moving the flag into the isDepthComputed() clause is extra log messages if you are running with VSLAM disabled (it will never get a depth map). That doesn't seem like a big deal to me, since people would rarely want to run with VSLAM disabled entirely.

In the cases I was seeing, it would only result in extra logs until tracking starts, which is reasonable.

@laxnpander Moving m_is_projection_plane_offset_computed definitely helps with the fallback mode. The first image or two get projected with the offset set to 0, while the remaining ones use the newly computed offset. Here is the elevation.png map with and without the fix; with the fix, only the first image is projected with an offset of 0.0.

Keep in mind the images below are a worst case with intentionally miscalculated camera parameters.

Current State:
no_fix
result_canvas

With m_is_projection_plane_offset_computed moved:
with_fix
result_canvas

However, overall in other datasets I have been seeing better alignment computing the elevation with every frame. This especially appears to be the case on very large datasets with rolling hills.

@zthorson: Glad to hear it got better. It's quite significantly better; great idea to check it that way! I am also surprised the tracking works with such large calibration tweaks. Nice idea, I will keep it in mind.

An additional flag to activate frame-to-frame correction won't hurt nevertheless. It's a small computation, and if it can improve visual results, why not.

@laxnpander Ok, I already implemented a test of it. I settled on compute_all_frames for the name, defaulting to off. I'm open to other names if you have a preference.
add("compute_all_frames", Parameter_t<int>{0, "When in PLANAR mode, estimate the elevation of every frame using sparse data rather than locking on the first"});

I'll get it into a pull request sometime tomorrow, but in the meantime testing has shown good results on some sets and no real difference on others. It really ends up depending on the amount of elevation change across the overall image.

With compute_all_frames off (default):
elevation
result_canvas_caf0

With compute_all_frames on (optional):
elevation
result_canvas_caf1