HengyiWang/Co-SLAM

Custom Blender Data

HamzaOuajhain opened this issue · 5 comments

Hello, and thank you for this wonderful repository.

I have created a scene in Blender with a very simple object and a camera rotating around it, but unfortunately I was not able to get good results. I have tried many configurations and different conventions in order to solve this issue.

[image]

[video: blenderanimation-2024-07-03_15.43.43.mp4]

For a total of 2001 frames, I created a camera that outputs both depth and RGB information, plus a script that converts the poses from the Blender convention to the OpenCV convention (sketched below), so that the data can be run the same way Replica is run with Co-SLAM. I used my Replica config as a base to create my own yaml file.
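For reference, this is the kind of transform I mean. A minimal sketch, assuming 4×4 camera-to-world matrices exported from Blender (e.g. `cam.matrix_world`); the function name is illustrative:

```python
# Minimal sketch, assuming 4x4 camera-to-world matrices from Blender.
# Blender/OpenGL cameras look down -Z with +Y up; OpenCV cameras look
# down +Z with Y down, so flipping the camera's Y and Z axes converts
# between the two conventions.
import numpy as np

GL_TO_CV = np.diag([1.0, -1.0, -1.0, 1.0])  # flips the camera's Y and Z axes

def blender_c2w_to_opencv_w2c(c2w_blender: np.ndarray) -> np.ndarray:
    c2w_cv = c2w_blender @ GL_TO_CV  # re-express camera axes in OpenCV terms
    return np.linalg.inv(c2w_cv)    # camera-to-world -> world-to-camera
```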
The main difference in the yaml file is the camera block:

```yaml
cam:
  H: 480
  W: 640
  fx: 888.8889
  fy: 1000.0000
  cx: 320.0
  cy: 240.0
```
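As a sanity check on these intrinsics, a minimal sketch of how fx/fy in pixels follow from Blender's camera settings; the 50 mm focal length and 36 × 24 mm sensor are assumptions that happen to reproduce the values above:

```python
# Minimal sketch, assuming a 50 mm lens on a 36 x 24 mm sensor (an
# assumption that reproduces fx = 888.8889, fy = 1000.0 exactly).
W, H = 640, 480
f_mm, sensor_w_mm, sensor_h_mm = 50.0, 36.0, 24.0

fx = f_mm / sensor_w_mm * W  # 888.888...
fy = f_mm / sensor_h_mm * H  # 1000.0
# Note: with square pixels and Blender's default horizontal sensor fit,
# fy would normally equal fx rather than being derived from the sensor height.
cx, cy = W / 2.0, H / 2.0
print(fx, fy, cx, cy)
```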
For the bound I chose:

```yaml
mapping:
  bound: [[-5,5],[-5,5],[-5,5]]
  marching_cubes_bound: [[-5,5],[-5,5],[-5,5]]
```
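A quick way to check that bound against the exported trajectory, as a minimal sketch; it assumes a Replica-style traj.txt (one flattened 4×4 c2w matrix per line) under the datadir from my config:

```python
# Minimal sketch: verify all camera centres fall inside the mapping bound.
# Assumes a Replica-style traj.txt with one flattened 4x4 c2w per line.
import numpy as np

poses = np.loadtxt('data/boxibox/traj.txt').reshape(-1, 4, 4)
centres = poses[:, :3, 3]
bound = np.array([[-5, 5], [-5, 5], [-5, 5]], dtype=float)
inside = ((centres >= bound[:, 0]) & (centres <= bound[:, 1])).all()
print('centre min/max per axis:', centres.min(0), centres.max(0))
print('all centres inside bound:', bool(inside))
```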
This is the visualisation of the bounding box:
[image]

This is the resulting mesh from vis_bound:
[image]

Results:

[image]

From the way the estimated trajectory matches the expected ground truth in the ATE RMSE evaluation, we can conclude that the convention used is correct.

[image]
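For context, ATE RMSE here is just the root-mean-square of the per-frame translation errors between the two trajectories. A minimal sketch, which skips the trajectory alignment the full evaluation performs:

```python
# Minimal sketch of translation-only ATE RMSE, assuming both inputs are
# Nx3 arrays of camera centres already expressed in the same frame.
import numpy as np

def ate_rmse(t_est: np.ndarray, t_gt: np.ndarray) -> float:
    err = np.linalg.norm(t_est - t_gt, axis=1)  # per-frame translation error
    return float(np.sqrt((err ** 2).mean()))
```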

Any suggestions as to where things went wrong?

From the result of vis_mesh, the camera convention seems to be wrong. We use OpenGL c2w; please double-check your camera pose convention (OpenGL or OpenCV, w2c or c2w).

I understand that; I had assumed that the Blender and OpenGL conventions are the same. What I did was transform the poses from the Blender convention to OpenCV w2c, and when I run vis_bound the same way it is run with the Replica yaml files, I get a better result, but not the shape I want.
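Given the note above that Co-SLAM uses OpenGL c2w, the conversion back from what I currently have would look roughly like this. A minimal sketch, assuming 4×4 OpenCV-convention w2c matrices:

```python
# Minimal sketch: OpenCV-convention w2c -> OpenGL-convention c2w,
# assuming 4x4 row-major pose matrices.
import numpy as np

CV_TO_GL = np.diag([1.0, -1.0, -1.0, 1.0])  # flips camera Y and Z axes back

def opencv_w2c_to_opengl_c2w(w2c_cv: np.ndarray) -> np.ndarray:
    c2w_cv = np.linalg.inv(w2c_cv)  # world-to-camera -> camera-to-world
    return c2w_cv @ CV_TO_GL        # OpenCV camera axes -> OpenGL
```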

[image]

```yaml
inherit_from: configs/Thebox/box.yaml

mapping:
  bound: [[-5,5],[-5,5],[-5,5]]
  marching_cubes_bound: [[-5,5],[-5,5],[-5,5]]

data:
  datadir: data/boxibox
  trainskip: 1
  output: output/result_boxibox
  exp_name: demo
```

```yaml
dataset: 'replica'

data:
  downsample: 1
  sc_factor: 1
  translation: 0
  num_workers: 4

mapping:
  sample: 2048
  first_mesh: True
  iters: 10
  cur_frame_iters: 0
  lr_embed: 0.01
  lr_decoder: 0.01
  lr_rot: 0.001
  lr_trans: 0.001
  keyframe_every: 5
  map_every: 5
  n_pixels: 0.05
  first_iters: 200
  optim_cur: True
  min_pixels_cur: 100
  map_accum_step: 1
  pose_accum_step: 5
  map_wait_step: 0
  filter_depth: False

tracking:
  iter: 10
  sample: 1024
  pc_samples: 40960
  lr_rot: 0.001
  lr_trans: 0.001
  ignore_edge_W: 20
  ignore_edge_H: 20
  iter_point: 0
  wait_iters: 100
  const_speed: True
  best: True

grid:
  enc: 'HashGrid'
  tcnn_encoding: True
  hash_size: 16
  voxel_color: 0.08
  voxel_sdf: 0.02
  oneGrid: True

pos:
  enc: 'OneBlob'
  n_bins: 16

decoder:
  geo_feat_dim: 15
  hidden_dim: 32
  num_layers: 2
  num_layers_color: 2
  hidden_dim_color: 32
  tcnn_network: False

cam:
  H: 480
  W: 640
  fx: 800.0000
  fy: 800.0000
  cx: 320.0
  cy: 240.0
  png_depth_scale: 6553.5  # for depth images in png format
  crop_edge: 0
  near: 0
  far: 5
  depth_trunc: 100.

training:
  rgb_weight: 5.0
  depth_weight: 0.1
  sdf_weight: 1000
  fs_weight: 10
  eikonal_weight: 0
  smooth_weight: 0.000001
  smooth_pts: 32
  smooth_vox: 0.1
  smooth_margin: 0.05
  #n_samples: 256
  n_samples_d: 32
  range_d: 0.1
  n_range_d: 11
  n_importance: 0
  perturb: 1
  white_bkgd: False
  trunc: 0.1
  rot_rep: 'axis_angle'
  rgb_missing: 0.05

mesh:
  resolution: 512
  render_color: False
  vis: 500
  voxel_eval: 0.05
  voxel_final: 0.02
  visualisation: False
```

This is how the two yaml files look.

Hi @HamzaOuajhain, the problem might be related to the scale of the depth, i.e. png_depth_scale (1, 1000, or 6553.5).
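To illustrate what that scale means: the loader recovers metric depth as png_value / png_depth_scale, so the yaml value must match the factor applied when the depth PNGs were written (with 6553.5 and uint16, the representable range is 0–10 m). A minimal sketch, assuming 16-bit PNG depth; the file name and the placeholder depth values are illustrative:

```python
# Minimal sketch, assuming depth is stored as 16-bit PNG. The yaml's
# cam.png_depth_scale must match the factor used here when writing.
import numpy as np
import cv2

depth_m = np.full((480, 640), 2.5, dtype=np.float32)  # placeholder: 2.5 m everywhere
png_depth_scale = 6553.5  # uint16 then covers 0 .. 65535/6553.5 = 10 m

depth_png = np.round(depth_m * png_depth_scale).astype(np.uint16)
cv2.imwrite('depth_000000.png', depth_png)

# Sanity check: reading back should recover the metric depth.
restored = cv2.imread('depth_000000.png', cv2.IMREAD_UNCHANGED) / png_depth_scale
assert np.allclose(restored, depth_m, atol=1e-3)
```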

Hello @HengyiWang, I have tried different configurations for the depth scale and am still having the same issue. Any idea what the cause could be?