Error running on Windows
Hi, I have compiled and am running the example:
D:\NERF\4K-NeRF-main\4K-NeRF-main>python run.py --config configs/llff/fern_lg_pretrain.py --render_test
And it gives me the error:
Using C:\Users\chemi\AppData\Local\torch_extensions\torch_extensions\Cache\py311_cu117 as PyTorch extensions root...
D:\Python\Lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\chemi\AppData\Local\torch_extensions\torch_extensions\Cache\py311_cu117\adam_upd_cuda\build.ninja...
INFO: Could not find files for the given pattern(s).
Traceback (most recent call last):
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run.py", line 11, in <module>
from lib import img_encoder, utils, dvgo, dcvgo, dmpigo, sr_esrnet, sr_unetdisc, dvqgo
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\utils.py", line 12, in <module>
from .masked_adam import MaskedAdam
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\masked_adam.py", line 7, in <module>
adam_upd_cuda = load(
^^^^^
File "D:\Python\Lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
return _jit_compile(
^^^^^^^^^^^^^
File "D:\Python\Lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "D:\Python\Lib\site-packages\torch\utils\cpp_extension.py", line 1611, in _write_ninja_file_and_build_library
_write_ninja_file_to_build_library(
File "D:\Python\Lib\site-packages\torch\utils\cpp_extension.py", line 2048, in _write_ninja_file_to_build_library
_write_ninja_file(
File "D:\Python\Lib\site-packages\torch\utils\cpp_extension.py", line 2188, in _write_ninja_file
cl_paths = subprocess.check_output(['where',
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Python\Lib\subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Python\Lib\subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1.
What am I doing wrong here?
I have installed ninja and the build tools for every Visual Studio version. I am on Windows 11, with PyTorch 2.0 and CUDA 11.7.
Please help! Thank you
I seem to have resolved this by adding the path to 'cl.exe' to my system PATH environment variable. In my case, it was:
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.34.31933\bin\Hostx64\x64
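Before editing PATH, it can help to confirm from Python whether the compiler is visible at all, since the traceback above shows PyTorch's extension loader running `where cl`. This is a minimal sketch using only the standard library; the helper name `find_compiler` is made up for illustration.

```python
import shutil


def find_compiler(name="cl"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_compiler("cl")
    if path is None:
        # Same situation that makes `where cl` return a non-zero exit status.
        print("cl.exe is not on PATH; add the MSVC bin\\Hostx64\\x64 folder to PATH.")
    else:
        print("Found compiler at:", path)
```

If this prints a path, the `where cl` step should no longer be the failing one.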
However, now I have:
[3/3] "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.34.31933\bin\Hostx64\x64/link.exe" ub360_utils.o ub360_utils_kernel.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z torch_cuda_cpp.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:C:\Users\B\AppData\Roaming\Python\Python39\site-packages\torch\lib torch_python.lib "/LIBPATH:C:\Program Files\Python39\libs" "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\lib/x64" cudart.lib /out:ub360_utils_cuda.pyd
Creating library ub360_utils_cuda.lib and object ub360_utils_cuda.exp
Loading extension module ub360_utils_cuda...
Traceback (most recent call last):
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run.py", line 11, in <module>
from lib import img_encoder, utils, dvgo, dcvgo, dmpigo, sr_esrnet, sr_unetdisc, dvqgo
ImportError: cannot import name 'img_encoder' from 'lib' (D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\__init__.py)
I solved this problem by deleting 'img_encoder' from that import line.
By the way, what are your PyTorch and CUDA versions?
Thank you! I hit one more error:
AttributeError: module 'mmcv' has no attribute 'Config'
Installing mmcv with this command:
pip3 install mmcv-full==1.3.15 -f https://download.openmmlab.com/mmcv/dist/cu111/torch2.0.0/index.html
has resolved it.
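As far as I know, `mmcv.Config` exists in the mmcv 1.x series and was moved out of the package in mmcv 2.x, which would explain why pinning a 1.x release helps. The sketch below checks which version is installed; the helper names `installed_version` and `has_legacy_config` are hypothetical, and the "major version below 2" rule is my assumption rather than something stated in this thread.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_legacy_config(version):
    """True if this mmcv version is 1.x or older (assumed to provide mmcv.Config)."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    # The package may be installed under either name.
    for pkg in ("mmcv", "mmcv-full"):
        ver = installed_version(pkg)
        if ver is not None:
            print(pkg, ver, "mmcv.Config expected:", has_legacy_config(ver))
```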
Training right now on the fern dataset!
One more question: to train on my own data, what inputs are required? Can I use a COLMAP file for camera positions etc.?
(CUDA 11.7, PyTorch 2.0.0)
Actually, I still see problems... I have started again from scratch, to avoid any problems.
PyTorch 1.13, CUDA 11.7, Python 3.9
My steps:
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install mmcv==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13/index.html
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
pip install -r requirements.txt
python run.py --config configs/llff/fern_lg_pretrain.py --render_test
//runs, but stops at:
Traceback (most recent call last):
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run.py", line 810, in <module>
imageio.imwrite(filename, rgb8) ##error here
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\imageio\v2.py", line 263, in imwrite
with imopen(uri, "wi", **imopen_args) as file:
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\imageio\core\imopen.py", line 113, in imopen
request = Request(uri, io_mode, format_hint=format_hint, extension=extension)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\imageio\core\request.py", line 247, in __init__
self._parse_uri(uri)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\imageio\core\request.py", line 412, in _parse_uri
raise FileNotFoundError("The directory %r does not exist" % dn)
FileNotFoundError: The directory 'D:\\NERF\\4K-NeRF-main\\4K-NeRF-main\\logs\\llff\\pretrain_fern_l1\\render_test_llff\\pretrain_fern_l1\\fine_last\\llff\\pretrain_fern_l1' does not exist
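A possible workaround for the FileNotFoundError above is to create the target directory before writing the image. This is only a sketch: `ensure_parent_dir` is a hypothetical helper, not part of run.py, and it treats the symptom rather than the cause. The repeated path segments in the message suggest the filename may be built in a way that assumes '/' separators, but that is a guess.

```python
import os


def ensure_parent_dir(filename):
    """Create the directory that will contain `filename` if it is missing."""
    parent = os.path.dirname(os.path.abspath(filename))
    os.makedirs(parent, exist_ok=True)
    return parent


# Intended use, just before the failing call in run.py:
#   ensure_parent_dir(filename)
#   imageio.imwrite(filename, rgb8)
```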
python run_sr.py --config configs/llff/fern_lg_joint_l1+gan.py --render_test --ftdv_path logs/llff/pretrain_fern_l1/fine_last.tar --ftsr_path ./pretrained/RealESRNet_x4plus.pth --test_tile 510
//Does not start, error:
Using C:\Users\B\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\B\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu117\adam_upd_cuda\build.ninja...
Traceback (most recent call last):
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run_sr.py", line 12, in <module>
from lib import utils, dvgo, dcvgo, dmpigo, sr_esrnet, sr_unetdisc
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\utils.py", line 12, in <module>
from .masked_adam import MaskedAdam
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\masked_adam.py", line 7, in <module>
adam_upd_cuda = load(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
return _jit_compile(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1610, in _write_ninja_file_and_build_library
_write_ninja_file_to_build_library(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 2014, in _write_ninja_file_to_build_library
cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1780, in _get_cuda_arch_flags
arch_list[-1] += '+PTX'
IndexError: list index out of range
I didn't have the first problem. But I, like you, have the second problem.
Ah okay! I am out of ideas for that one. It seems like a CUDA problem with mmcv, but I have reinstalled everything from scratch and I still see the problem. I have even tried mmcv with CPU only.
Let me know if you have any ideas!
@frozoul sorry to bug you, do you have any thoughts on what we might be hitting here? Would it be possible to share the details of your PyTorch, mmcv and CUDA versions? Thank you!
@XLR-man it looks like this is an issue with PyTorch's compatibility with the GPU,
but I get the correct compute capability (CC) when I run:
>>> torch.cuda.get_arch_list()
['sm_86']
@XLR-man I got it running further by simply commenting out line 1780 in
C:\Users\B\AppData\Local\Programs\Python\Python39\Lib\site-packages\torch\utils\cpp_extension.py:
# arch_list[-1] += '+PTX'
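An alternative to editing the installed torch file may be to set the `TORCH_CUDA_ARCH_LIST` environment variable. As far as I can tell, `cpp_extension.py` only builds `arch_list` from the visible GPUs when that variable is unset, so setting it should skip the failing line. The sketch below assumes the 'sm_86' value printed above; `arch_to_env` is a hypothetical helper that only handles plain names of the form 'sm_NN'.

```python
import os


def arch_to_env(sm_name):
    """Convert a name such as 'sm_86' into the '8.6' form used by TORCH_CUDA_ARCH_LIST."""
    digits = sm_name.split("_", 1)[1]
    return f"{digits[:-1]}.{digits[-1]}"


# This must run before lib/masked_adam.py calls torch.utils.cpp_extension.load().
# 'sm_86' is the value reported by torch.cuda.get_arch_list() in this thread;
# setdefault() keeps any value that is already set in the environment.
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", arch_to_env("sm_86"))
```

Placing these lines at the top of run_sr.py, before the `from lib import ...` line, keeps the change inside the project instead of site-packages.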
I also had to change 'cp' to 'copy' on line 69 of:
\lib\load_llff.py
check_output('copy {}/* {}'.format(imgdir_orig, imgdir), shell=True)
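A shell-free variant of that copy step would behave the same on Windows and Linux. This is a sketch, not the repository's code: `copy_images` is a hypothetical helper, and `imgdir_orig` and `imgdir` are the names from the line above. The `copy` command depends on cmd.exe, which may treat the '/' in the pattern as a switch prefix, so the shell version can fail to copy anything.

```python
import os
import shutil


def copy_images(imgdir_orig, imgdir):
    """Copy every regular file from imgdir_orig into imgdir; return the count."""
    os.makedirs(imgdir, exist_ok=True)
    copied = 0
    for name in sorted(os.listdir(imgdir_orig)):
        src = os.path.join(imgdir_orig, name)
        if os.path.isfile(src):
            shutil.copy(src, os.path.join(imgdir, name))
            copied += 1
    return copied
```

The returned count gives a quick check that the images actually arrived before the loader compares them against the poses.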
Now I am hitting a new error:
Loading images from ./datasets/nerf_llff_data/fern\images_1008x756
Mismatch between imgs 0 and poses 20 !!!!
Traceback (most recent call last):
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run_sr.py", line 1256, in <module>
data_dict = load_everything(args=args, cfg=cfg)
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\run_sr.py", line 198, in load_everything
data_dict = load_data(cfg.data)
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\load_data.py", line 19, in load_data
images, depths, poses, bds, render_poses, i_test, *srgt = load_llff_data(
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\load_llff.py", line 340, in load_llff_data
poses, bds, imgs, *depths = _load_data(basedir, factor=factor, width=width, height=height,
File "D:\NERF\4K-NeRF-main\4K-NeRF-main\lib\load_llff.py", line 131, in _load_data
names = set(name[:-4] for name in np.load(os.path.join(basedir, 'poses_names.npy')))
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\lib\npyio.py", line 405, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: './datasets/nerf_llff_data/fern\\poses_names.npy'
It looks like this file is supposed to be saved in the previous step, but that is not happening...
When I comment out arch_list[-1] += '+PTX', I get an error:
Traceback (most recent call last):
File "run_sr.py", line 12, in <module>
from lib import utils, dvgo, dcvgo, dmpigo, sr_esrnet, sr_unetdisc
File "/root/autodl-tmp/4K-NeRF-main/lib/utils.py", line 12, in <module>
from .masked_adam import MaskedAdam
File "/root/autodl-tmp/4K-NeRF-main/lib/masked_adam.py", line 7, in <module>
adam_upd_cuda = load(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1124, in load
return _jit_compile(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
_write_ninja_file_and_build_library(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'adam_upd_cuda'
How did you do it?
Also, you need to create the 'datasets' folder first and put the downloaded datasets in it. If you haven't downloaded them yet, the README explains how.
When I comment out arch_list[-1] += '+PTX', I get an error:
get_training_rays: start
get_training_rays: finish (eps time: 39.950878620147705 sec)
0%| | 0/270000 [00:16<?, ?it/s]
Traceback (most recent call last):
File "run_sr.py", line 1290, in <module>
train(args, cfg, data_dict)
File "run_sr.py", line 1219, in train
scene_rep_reconstruction_sr_patch(
File "run_sr.py", line 831, in scene_rep_reconstruction_sr_patch
target_4x = rgb_srgt_train[sel_b, sel_r_4x, sel_c_4x, :]
IndexError: index 3498 is out of bounds for dimension 0 with size 3024
I did not have that error, sorry! That is an out-of-range index error, but I am not sure what it would be from.
Mine stops at
FileNotFoundError: [Errno 2] No such file or directory: './datasets/nerf_llff_data/fern\\poses_names.npy'
That 'poses_names.npy' is not in the dataset, do you have it?
I have the same error (IndexError: index 3498 is out of bounds for dimension 0 with size 3024). Did you manage to solve it?
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
You may need to comment out the 3rd line in run_sr.py:
os.environ["CUDA_VISIBLE_DEVICES"]="1"
or change the number "1" to "0".
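The value "1" selects the second GPU, so on a single-GPU machine it hides the only device; that may also be why the arch list came out empty earlier in the thread, though I have not confirmed it. A small sketch of a safer default follows; `pick_visible_device` is a hypothetical helper and `gpu_count=1` is an assumption for a single-GPU machine.

```python
import os


def pick_visible_device(requested, gpu_count):
    """Return `requested` if that GPU index exists, else fall back to 0."""
    return requested if 0 <= requested < gpu_count else 0


# gpu_count=1 is an assumption for a single-GPU machine; set it to your own count.
# This must run before torch initializes CUDA for it to take effect.
os.environ["CUDA_VISIBLE_DEVICES"] = str(pick_visible_device(1, gpu_count=1))
```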