About GPU memory reported in README.md
Closed this issue · 11 comments
Hi, I would like to ask about the GPU memory reported in line 118 of 01672dc, which mentions a memory requirement of 23.5GB. Does this refer to the total memory cost of the entire pipeline (including DUSt3R), or only the video diffusion model?
I attempted to run the ViewCrafter_25 model on a GPU with 48 GB of memory, but encountered a "CUDA out of memory" error. I'd like to confirm if this behavior is expected.
Thank you for your help!
Hi, 23.5GB is only for the video diffusion model. Are you processing more than 30 input images using DUSt3R? It will raise an OOM error.
Thank you for the quick response! No, I was simply running the provided inference script: "sh run.sh". The initial point cloud has been successfully obtained and saved, but a CUDA OOM error occurred during the forward pass of video diffusion. Does it have to be run on an 80G GPU like A100?
Also, is there an official inference script for the smaller models like ViewCrafter_25_512?
Thanks again and congrats on your great work!
Thanks! It doesn't have to run on an 80G A100; we conducted all the experiments on a single 40G A100 / 32G V100 GPU and didn't encounter any OOM errors. In theory, if you run run.sh without modifying any parameters, it should not raise an OOM error. Are you concurrently running other processes on your GPU machine? BTW, to run the ViewCrafter_25_512 model, you should change `--height 576 --width 1024 --config configs/inference_pvd_1024.yaml` in run.sh to `--height 320 --width 512 --config configs/inference_pvd_512.yaml`, and change the hard-coded resolution in ViewCrafter/utils/pvd_utils.py (lines 147 and 533 in 01672dc) to 320x512.
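Putting the flag changes above in one place, a sketch of the edited lines of run.sh for the 512 model (the entry-point script name and any remaining flags are assumptions, not from this thread; only the three flags come from the instructions above):

```shell
# Sketch of run.sh edited for ViewCrafter_25_512; other flags left as in the
# original script. "inference.py" is an assumed entry point for illustration.
python inference.py \
  --height 320 \
  --width 512 \
  --config configs/inference_pvd_512.yaml
```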
No, I wasn't concurrently running other procedures or modifying any parameters. The information you provided has been very helpful, I think I need to investigate further. I'll provide feedback if I discover anything. Thank you!
It cannot be run directly; img_ori also needs to be resized at line 331 of viewcrafter.py.
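A hedged sketch of such a resize, assuming img_ori is an HxWxC numpy image array (its actual type at that line is an assumption). Nearest-neighbor indexing is used here to keep the example dependency-light; the real code would likely use proper interpolation (e.g. via PIL or torch):

```python
import numpy as np

def resize_img_ori(img_ori: np.ndarray, height: int = 320, width: int = 512) -> np.ndarray:
    """Nearest-neighbor resize of an HxWxC image array to (height, width).

    Illustrative only: assumes img_ori is a numpy array; production code
    would typically use bilinear interpolation.
    """
    h, w = img_ori.shape[:2]
    rows = np.arange(height) * h // height  # source row for each output row
    cols = np.arange(width) * w // width    # source column for each output column
    return img_ori[rows][:, cols]
```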
Running a 320x512 input with 24GB of memory still results in an OOM error.
I don't know the reason, but:
- python 3.11 + pytorch 2.4.0 + cuda 12.1 -> OK
- python 3.10 + pytorch 2.4.0 + cuda 11.8 -> OOM
I'll share here if I figure out why.
Cool! It seems that the OOM is indeed caused by the version difference, although my results differ from yours. In my case:
- python 3.11 + pytorch 2.1.0 + cuda 11.8 -> OK
- python 3.12 + pytorch 2.4.1 + cuda 12.1 -> OOM
In my case, the CUDA OOM happened because `pip install -r requirements.txt` failed to install xformers due to a version mismatch. The codebase seemed to work without xformers, but that leads to much larger GPU memory consumption -> OOM on the A100 (40GB) on my workstation. I fixed this right after installing the correct xformers version. The memory requirement for ViewCrafter_25 on my machine is 23.3GB.
Managed to make it work in two different environments:
- python 3.9 + pytorch 2.0.1 + cuda 11.7
- python 3.10 + pytorch 2.1.2 + cuda 12.1
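The silent-fallback failure mode above can be detected up front. A minimal sketch (not part of the ViewCrafter codebase) that checks whether xformers is importable before launching inference:

```python
import importlib.util

def has_xformers() -> bool:
    """Return True if the xformers package is importable in this environment."""
    return importlib.util.find_spec("xformers") is not None

if not has_xformers():
    print("xformers missing: attention falls back to a plain implementation, "
          "which uses far more GPU memory and may trigger OOM.")
```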
Thank you for sharing! I checked my environments and as you said:
- python 3.11 + pytorch 2.1.0 + cuda 11.8 (xformers==0.0.22.post7+cu118) -> OK
- python 3.12 + pytorch 2.4.1 + cuda 12.1 (without xformers) -> OOM
Your explanation resolves the problem very well; I think I can close this issue now.
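For anyone else comparing environments like the ones listed in this thread, a small sketch that gathers the relevant versions in one call (torch or xformers may be absent, in which case they are reported as None):

```python
import platform

def env_report() -> dict:
    """Collect the Python / torch / CUDA / xformers versions compared in
    this thread. Missing packages are reported as None rather than raising."""
    info = {"python": platform.python_version()}
    try:
        import torch
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # CUDA version torch was built against
    except ImportError:
        info["torch"] = None
        info["cuda"] = None
    try:
        import xformers
        info["xformers"] = xformers.__version__
    except ImportError:
        info["xformers"] = None
    return info

print(env_report())
```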
Installing a matching xformers version solved my problem; currently torch==2.3.0, cuda 12.1, xformers==0.0.26.post