broadinstitute/CellBender

OOM during posterior inference for chimeric sample, even with --posterior-batch-size 1

njbernstein opened this issue · 0 comments

Hi there,

I'm getting an OOM during posterior inference even when I set --posterior-batch-size to 1. This is a chimeric sample, so the feature vector is twice its usual length. The GPU I'm on has 24 GB of memory. Is there anything else you'd suggest I try to keep CellBender from running out of memory?
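For reference, this is roughly the shape of the command I'm running (the paths here are placeholders, not my exact files):

    cellbender remove-background \
        --input raw_feature_bc_matrix.h5 \
        --output output_filtered.h5 \
        --cuda \
        --posterior-batch-size 1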

Error:

Traceback (most recent call last):
  File "/efs/prefect/conda/envs/cellbender-gpu/bin/cellbender", line 8, in <module>
    sys.exit(main())
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/base_cli.py", line 101, in main
    cli_dict[args.tool].run(args)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 109, in run
    main(args)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 204, in main
    run_remove_background(args)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 174, in run_remove_background
    save_plots=True)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/data/dataset.py", line 534, in save_to_output_file
    inferred_count_matrix = self.posterior.mean
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/infer.py", line 58, in mean
    self._get_mean()
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/infer.py", line 372, in _get_mean
    alpha_est=map_est['alpha'])
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/infer.py", line 717, in _lambda_binary_search_given_fpr
    alpha_est=alpha_est)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/infer.py", line 657, in _calculate_expected_fpr_given_lambda_mult
    alpha_est=alpha_est)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/cellbender/remove_background/infer.py", line 570, in _true_counts_from_params
    .log_prob(noise_count_tensor)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/pyro/distributions/torch.py", line 280, in log_prob
    return super().log_prob(value)
  File "/efs/prefect/conda/envs/cellbender-gpu/lib/python3.7/site-packages/torch/distributions/poisson.py", line 69, in log_prob
    return value.xlogy(rate) - rate - (value + 1).lgamma()
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.04 GiB (GPU 0; 21.96 GiB total capacity; 7.14 GiB already allocated; 1.44 GiB free; 7.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
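The only other lever I can see is the allocator hint in the error message itself, i.e. setting PYTORCH_CUDA_ALLOC_CONF to limit split sizes before the run, though I'm not sure it helps when the failure is a single ~2 GiB allocation rather than fragmentation (128 below is just an example value):

    PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 cellbender remove-background \
        --input raw_feature_bc_matrix.h5 \
        --output output_filtered.h5 \
        --cuda \
        --posterior-batch-size 1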