[issue] RuntimeError: Trying to create tensor with negative dimension

I have a question about using nerfacc in threestudio.
Training fails with the following error:

RuntimeError: Trying to create tensor with negative dimension -34359738384: [-34359738384]

My environment is as follows:

OS: Ubuntu 22.04.1
GPU: NVIDIA RTX A6000
cudatoolkit: 12.1
pytorch: 2.3.0+cu121
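
The list above does not include the installed nerfacc or pytorch_lightning versions. A minimal sketch for collecting them, assuming the packages were installed under the distribution names `torch`, `nerfacc`, and `pytorch-lightning` (the helper name `collect_versions` is my own, not part of any of these libraries):

```python
from importlib import metadata


def collect_versions(packages):
    """Return {distribution name: installed version, or 'not installed'}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is missing or was installed under another name.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match the environment.
    for name, version in collect_versions(
        ["torch", "nerfacc", "pytorch-lightning"]
    ).items():
        print(f"{name}: {version}")
```

Running this inside the same conda environment prints one line per package, which can be pasted next to the environment details.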

The full output log is as follows:

(three-12.1) bunny@star-SYS-4029GP-TRT:/mnt/bunny/3DAIGC/official/threestudio$ bash ./bash/run_dreamfusion.bash                                                                                           
Seed set to 0                                                                                                                                                                                                      
[INFO] Using 16bit Automatic Mixed Precision (AMP)                                                                                                                                                                 
[INFO] GPU available: True (cuda), used: True                                                                                                                                                                      
[INFO] TPU available: False, using: 0 TPU cores                                                                                                                                                                    
[INFO] IPU available: False, using: 0 IPUs                                                                                                                                                                         
[INFO] HPU available: False, using: 0 HPUs                                                                                                                                                                         
[INFO] You are using a CUDA device ('NVIDIA RTX A6000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
/mnt/bunny/3DAIGC/official/threestudio/threestudio/data/uncond.py:400: UserWarning: Using torch.cross without specifying the dim arg is deprecated.                                                                  
Please either pass the dim explicitly or simply use torch.linalg.cross.                                                                                                                                            
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at ../aten/src/ATen/native/Cross.cpp:62.)                                                       
  right: Float[Tensor, "B 3"] = F.normalize(torch.cross(lookat, up), dim=-1)                                                                                                                                       
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]                                                                                                                                                                   
[INFO]                                                                                                                                                                                                             
  | Name       | Type                           | Params                                                                                                                                                           
--------------------------------------------------------------                                                                                                                                                     
0 | geometry   | ImplicitVolume                 | 12.6 M                                                                                                                                                           
1 | material   | DiffuseWithPointLightMaterial  | 0                                                                                                                                                                
2 | background | NeuralEnvironmentMapBackground | 448                                                                                                                                                              
3 | renderer   | NeRFVolumeRenderer             | 0                                                                                                                                                                
--------------------------------------------------------------                                                                                                                                                     
12.6 M    Trainable params                                                                                                                                                                                         
0         Non-trainable params                                                                                                                                                                                     
12.6 M    Total params                                                                                                                                                                                             
50.419    Total estimated model params size (MB)                                                                                                                                                                   
[INFO] Validation results will be saved to outputs/dreamfusion-sd/a_zoomed_out_DSLR_photo_of_a_baby_bunny_sitting_on_top_of_a_stack_of_pancakes@20240607-153418/save                                               
[INFO] Using prompt [a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes] and negative prompt []                                                                                         
[INFO] Using view-dependent prompts [side]:[a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes, side view] [front]:[a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes, front view] [back]:[a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes, back view] [overhead]:[a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes, overhead view]
[INFO] Loading Stable Diffusion ...                                                                                                                                                                                
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:09<00:00,  2.29s/it]
[INFO] Loaded Stable Diffusion!                                                                                                                                                                                    
/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=103` in the `DataLoader` to improve performance.
/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=103` in the `DataLoader` to improve performance.
[WARNING] Empty rays_indices!
Traceback (most recent call last):
  File "/mnt/bunny/3DAIGC/official/threestudio/launch.py", line 301, in <module>
    main(args, extras)
  File "/mnt/bunny/3DAIGC/official/threestudio/launch.py", line 244, in main
    trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
    results = self._run_stage()
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1033, in _run_stage
    self.fit_loop.run()
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
    self.advance()
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run


  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 250, in advance
    batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 190, in run
    self._optimizer_step(batch_idx, closure)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 268, in _optimizer_step
    call._call_lightning_module_hook(
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/core/module.py", line 1303, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 152, in step
    step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 239, in optimizer_step
    return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/amp.py", line 80, in optimizer_step
    closure_result = closure()
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 144, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 129, in closure
    step_output = self._step_fn()
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 318, in _training_step
    training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 309, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 391, in training_step
    return self.lightning_module.training_step(*args, **kwargs)
  File "/mnt/bunny/3DAIGC/official/threestudio/threestudio/systems/dreamfusion.py", line 38, in training_step
    out = self(batch)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/bunny/3DAIGC/official/threestudio/threestudio/systems/dreamfusion.py", line 24, in forward
    render_out = self.renderer(**batch)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/bunny/3DAIGC/official/threestudio/threestudio/models/renderers/nerf_volume_renderer.py", line 170, in forward
    ray_indices, t_starts_, t_ends_ = self.estimator.sampling(
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
    intervals, samples = traverse_grids(
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/nerfacc/grid.py", line 135, in traverse_grids
    intervals, samples = _C.traverse_grids(
  File "/home/bunny/anaconda3/envs/three-12.1/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
RuntimeError: Trying to create tensor with negative dimension -34359738384: [-34359738384]
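
One observation on the reported size, offered as a sanity check rather than a diagnosis: the snippet below only does arithmetic on the number from the traceback, and the variable names are my own.

```python
# Dimension reported in the RuntimeError above.
dim = -34359738384

magnitude = -dim
# Offset of the magnitude from the nearest lower power of two.
remainder = magnitude - 2**35
# Whether the value could be stored in a signed 32-bit integer.
fits_int32 = -(2**31) <= dim <= 2**31 - 1

print(magnitude.bit_length())  # 36
print(remainder)               # 16
print(fits_int32)              # False
```

The magnitude is exactly 2^35 + 16, far outside the 32-bit range. That may hint at an overflowed or uninitialized count coming back from the CUDA kernel, possibly related to the `Empty rays_indices!` warning, but I have not confirmed this.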

I also found a similar issue, but it did not include a solution. Could anyone give me some suggestions? Thanks!