GPU OOMs

Question

Training on ~600 images at 1080p resolution runs out of GPU memory (OOM).
To make it fit, I reduced the last scale to 2 instead of 1 during training. This needs to be fixed.
However, after that I get an OOM during Poisson meshing:

File "render.py", line 106, in render_sets
    render_set(dataset.model_path, True, "train", scene.loaded_iter, scene.getTrainCameras(scales[0]), gaussians, pipeline, background, write_image, poisson_depth)
  File "render.py", line 88, in render_set
    poisson_mesh(mesh_path, resampled[:, :3], resampled[:, 3:6], resampled[:, 6:], poisson_depth, 1 * 1e-4)
  File "./gaussian_surfels/utils/general_utils.py", line 234, in poisson_mesh
    nn_dist, nn_idx, _ = knn_points(torch.from_numpy(vert).to(torch.float32).cuda()[None], vtx.cuda()[None], K=4)
RuntimeError: CUDA out of memory. Tried to allocate 14.17 GiB (GPU 0; 44.38 GiB total capacity; 39.34 GiB already allocated; 1.34 GiB free; 41.92 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation

Update:
Reducing the scale in the render script makes it work, but I am still wondering how the memory usage can be reduced.

A side note: it would be great if you could share the differences from the original GS repo.

Answer 1 · 2024-05-28T04:44:58.000Z

Hi, this OOM seems to happen in the knn_points() function when applying Poisson meshing. Because 600 images are used, the number of resampled points for surface reconstruction becomes very large. A possible solution is to leave out some of the views when resampling points, by skipping this line for some (e.g. 2/3) of the views.
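
The suggestion of skipping roughly two thirds of the views might be sketched as follows. This is only an illustration: `subsample_views` and `keep_every` are hypothetical names that do not exist in the repository, and the integers stand in for camera objects. The actual resampling loop in render.py would need to be adapted by hand.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def subsample_views(views: Sequence[T], keep_every: int = 3) -> List[T]:
    """Keep one view out of every `keep_every`, spread evenly over the sequence.

    Hypothetical helper, not part of the repository.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be >= 1")
    return list(views[::keep_every])


# Stand-in for the ~600 training cameras mentioned in the question.
views = list(range(600))
kept = subsample_views(views, keep_every=3)
print(len(kept), "of", len(views), "views contribute resampled points")
```

Taking every third view, rather than the first third, keeps the remaining cameras spread over the whole capture, assuming the views are ordered along the capture path.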

I am sorry, but this method was originally developed for small-scale object reconstruction. I'm afraid that high memory consumption when scaling up the training setting is indeed one of its problems.
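
One general way to bound the peak memory of a nearest-neighbour query, independent of this repository, is to process the query points in chunks. The sketch below is a brute-force NumPy illustration on the CPU, not the pytorch3d knn_points() call from the traceback. `knn_chunked` and `chunk_size` are hypothetical names, and whether chunking alone would be enough for the failing call is an assumption.

```python
import numpy as np


def knn_chunked(query, ref, k=4, chunk_size=1024):
    """Brute-force k-nearest neighbours, processing `query` in chunks.

    Temporary arrays scale with chunk_size * len(ref) instead of
    len(query) * len(ref). Requires k <= len(ref) and a non-empty query.
    Returns squared distances and indices, both of shape (len(query), k).
    """
    dists, idxs = [], []
    for start in range(0, len(query), chunk_size):
        q = query[start:start + chunk_size]                    # (c, 3)
        d2 = ((q[:, None, :] - ref[None, :, :]) ** 2).sum(-1)  # (c, M) squared distances
        nn = np.argpartition(d2, k - 1, axis=1)[:, :k]         # k smallest, unordered
        nn_d2 = np.take_along_axis(d2, nn, axis=1)
        order = np.argsort(nn_d2, axis=1)                      # order the k hits by distance
        idxs.append(np.take_along_axis(nn, order, axis=1))
        dists.append(np.take_along_axis(nn_d2, order, axis=1))
    return np.concatenate(dists), np.concatenate(idxs)


# Small usage example with random points.
rng = np.random.default_rng(0)
ref_pts = rng.random((500, 3))
query_pts = rng.random((2000, 3))
nn_dist, nn_idx = knn_chunked(query_pts, ref_pts, k=4, chunk_size=256)
print(nn_dist.shape, nn_idx.shape)
```

The same pattern could in principle be applied on the GPU by slicing the query tensor before each knn_points() call and concatenating the results, at the cost of more kernel launches.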

Answer 2 · 2024-05-29T05:24:07.000Z

@turandai Thanks a lot for responding.
Can you please share the key changes that you made w.r.t. the 3DGS repository? There might be other optimized codebases where this might be useful.

Answer 3 · 2024-06-14T12:43:51.000Z

Hi, the detailed changes are described in the paper, feel free to check them out. To sum up: 1) we flattened the z-scale of the 3DGS Gaussians into surfels with explicit normal and depth (ray-surfel intersection), then cull back-facing ones during rendering and optimization; 2) we introduced geometry constraints to the model by adding multiple regularization terms.
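
For readers coming from other 3DGS codebases, point 1) can be illustrated with a small NumPy sketch. It is not the repository's implementation, which lives in the rasterizer. All function names are hypothetical, and taking the local z-axis of the rotation as the surfel normal is an assumption made here for illustration.

```python
import numpy as np


def flatten_to_surfels(scales, rotations):
    """Zero the local z-scale and use the local z-axis as the surfel normal.

    scales: (N, 3) per-Gaussian scales; rotations: (N, 3, 3) rotation matrices.
    """
    flat = scales.copy()
    flat[:, 2] = 0.0              # collapse the Gaussian into a disc
    normals = rotations[:, :, 2]  # third column of each rotation (assumed convention)
    return flat, normals


def front_facing(centers, normals, cam_pos):
    """True where the surfel normal points toward the camera."""
    view = centers - cam_pos      # direction from camera to surfel
    return np.einsum("ij,ij->i", normals, view) < 0.0


def ray_surfel_depth(origin, direction, center, normal):
    """Ray parameter t where origin + t * direction meets the surfel's plane."""
    denom = normal @ direction
    if abs(denom) < 1e-8:
        return None               # ray is parallel to the surfel plane
    return float(normal @ (center - origin) / denom)


# Two surfels at the origin: one facing +z, one facing -z.
scales = np.array([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]])
rotations = np.stack([np.eye(3), np.diag([1.0, -1.0, -1.0])])
centers = np.zeros((2, 3))
cam_pos = np.array([0.0, 0.0, 5.0])

flat, normals = flatten_to_surfels(scales, rotations)
visible = front_facing(centers, normals, cam_pos)  # back-facing surfels would be culled
depth = ray_surfel_depth(cam_pos, np.array([0.0, 0.0, -1.0]), centers[0], normals[0])
print(visible, depth)
```

The regularization terms from point 2) are not sketched here, since the thread does not describe them; the paper is the reference for those.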