torch and cuda version of your work
zhangsongdmk opened this issue · 7 comments
Thank you for your great work. As far as I know, you are the only ones working on high-resolution NeRF.
I tried to run the 'training 4K resolution with L1 loss' step from the README.md but got the following error.
Sometimes this kind of failure is caused by a PyTorch/CUDA version mismatch, so could you share the PyTorch and CUDA version information?
File "/opt/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1561, in _get_cuda_arch_flags
arch_list[-1] += '+PTX'
IndexError: list index out of range
The error is raised in train.py and train_sr.py: os.environ['CUDA_VISIBLE_DEVICES'] is set to '1,6', not '0', so on a machine with only one GPU the code cannot find any GPU.
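The failure mode described above can be reproduced in isolation: PyTorch's `_get_cuda_arch_flags` builds a list of compute capabilities from the GPUs it can see, and when `CUDA_VISIBLE_DEVICES` points at nonexistent devices that list stays empty, so indexing `[-1]` blows up. A minimal sketch (no GPU or torch required):

```python
# Sketch of why _get_cuda_arch_flags fails: with no visible GPU,
# the detected arch list is empty, and arch_list[-1] raises IndexError.
arch_list = []  # what PyTorch ends up with when no GPU is detected

try:
    arch_list[-1] += '+PTX'
except IndexError as e:
    print(f"IndexError: {e}")
```

This prints `IndexError: list index out of range`, matching the traceback, which is why pointing `CUDA_VISIBLE_DEVICES` at a real device (or setting `TORCH_CUDA_ARCH_LIST` explicitly) makes the error go away.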
I had the same problem. Did you solve it?
Using /root/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py38_cu113/adam_upd_cuda/build.ninja...
Traceback (most recent call last):
File "run_sr.py", line 12, in
from lib import utils, dvgo, dcvgo, dmpigo, sr_esrnet, sr_unetdisc
File "/root/autodl-tmp/4K-NeRF-main/lib/utils.py", line 12, in
from .masked_adam import MaskedAdam
File "/root/autodl-tmp/4K-NeRF-main/lib/masked_adam.py", line 7, in
adam_upd_cuda = load(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1124, in load
return _jit_compile(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
_write_ninja_file_and_build_library(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1436, in _write_ninja_file_and_build_library
_write_ninja_file_to_build_library(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1834, in _write_ninja_file_to_build_library
cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1606, in _get_cuda_arch_flags
arch_list[-1] += '+PTX'
IndexError: list index out of range
Same here: the error is raised in train.py and train_sr.py because os.environ['CUDA_VISIBLE_DEVICES'] is set to '1,6', not '0', so with only one GPU the code cannot find the device.
But even after changing that, other problems remained and I didn't manage to run the code successfully.
I didn't solve it either. I only have one GPU, and my PyTorch and CUDA versions are 1.10 and 11.3, respectively. I think we might need a lower version.
Try setting the arch list explicitly before running, e.g. export TORCH_CUDA_ARCH_LIST="8.6" (use the compute capability that matches your GPU).
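Along the same lines, the environment can be fixed up in Python before torch is imported, so the JIT extension build never depends on GPU auto-detection. A minimal sketch, assuming a single-GPU machine where device 0 is the right one and "8.6" (Ampere, e.g. RTX 30xx) matches your card:

```python
import os

# Pin the compute capability so _get_cuda_arch_flags does not need to
# detect a GPU; "8.6" is an assumption for Ampere cards, adjust as needed.
os.environ["TORCH_CUDA_ARCH_LIST"] = "8.6"

# Make CUDA_VISIBLE_DEVICES name a device that actually exists;
# "0" assumes a single-GPU machine.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# These assignments must happen before `import torch` (and therefore
# before importing train.py / train_sr.py, which set it themselves).
print(os.environ["TORCH_CUDA_ARCH_LIST"])
```

Note that train.py and train_sr.py already assign `CUDA_VISIBLE_DEVICES` internally, so you would also need to change or remove that hard-coded '1,6' assignment in the repo for this to take effect.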